The Living Set

Mathematical and computational approaches are making strides in understanding how life might have emerged and organized itself from the basic chemistry of early Earth.

Jun 1, 2015
Wim Hordijk


Life is a self-sustaining network of chemical reactions. A living system produces its own components from basic food sources in such a way that these components maintain and regulate the very chemical network that produced them. Based on this notion of life, several models of minimal living systems were developed during the 1970s. These models captured an essential aspect of the organization of living things, but they could not directly explain how such systems emerged from a primordial soup of basic chemicals.

Over the past several years, one of these early models—that of autocatalytic sets—has been explored in more detail, both mathematically and with computer simulations. Autocatalytic sets are self-sustaining networks of chemical reactions that create and are catalyzed by components of the system itself. Recent research has overturned early criticisms regarding the plausibility of the spontaneous origin of such networks, and scientists have even applied the theoretical concepts to real chemical and biological systems, yielding important insights regarding the possible emergence, structure, and evolution of such systems. While many studying the origin of life are still focused on finding a self-replicating RNA molecule that could have served as the basis for modern life (see “RNA World 2.0,” The Scientist, March 2014), some now consider autocatalytic sets necessary conditions for its start.

The organization principle

Put some E. coli in a dish with appropriate nutrients, and after a few days the dish will be teeming with new bacterial offspring. But break down those same E. coli into their constituent molecules, add that molecular cocktail to a dish of nutrients, and nothing will happen. On the other hand, dried fertilized eggs from the common brine shrimp (Artemia) can be frozen in liquid helium at 2 K (near absolute zero), and then slowly warmed back to life. As biophysicists Arthur Skoultchi and Harold Morowitz demonstrated 50 years ago, eggs that returned to room temperature hatched and grew into healthy adults, which mated and laid viable eggs.

What’s the difference between these two scenarios? In Skoultchi and Morowitz’s experiment, even though the storage conditions were extreme, the brine shrimp’s organized network of chemical pathways remained intact, allowing it to be revived. Life is clearly more than the sum of its parts.

The importance of organization to a system’s function supports the idea that a living system can be defined as a functionally closed and self-sustaining chemical reaction network. Functionally closed means that the system’s own components are sufficient to fully implement and regulate its functionality—whatever it does to “make a living.” A functionally closed system is thus a self-regulating system. Self-sustaining means that the system’s own functionality is, in turn, sufficient to construct and repair its components from a basic food set available from the environment. This can include both naturally occurring substances—such as water, air, and minerals—and compounds that are reliably produced by other nearby systems.

A bacterium, such as E. coli, is both functionally closed and self-sustaining. Through the chemical reactions that make up its metabolic network, it can produce its own components from a mixture of glucose and various salts containing phosphate, potassium, and ammonium. These self-constructed components in turn determine and regulate its metabolic network. For example, one particular component, the cell membrane, encloses all the other components (molecules and organelles) in a confined space so that the required metabolism can occur. A bacterium is, by anyone’s definition, a living system.

A virus, on the other hand, is not functionally closed when it comes to reproduction. For that, it needs to “borrow” some of the functionality of a host cell. Most scientists do not consider a virus, by itself, to be alive.

Models of minimal life

In the early 1970s, several researchers used this notion of life as a functionally closed and self-sustaining system to develop formal models of what might constitute a minimal living system. These models include the hypercycle (by Nobel laureate Manfred Eigen in Germany), autopoietic systems (by Francisco Varela and Humberto Maturana in Chile), the chemoton model (by Tibor Gánti in Hungary), and autocatalytic sets (by Stuart Kauffman in the U.S.).

The mathematical properties of hypercycles (collections of self-replicating molecules in which each molecule “helps” the next one replicate, with the last one helping the first one) have been worked out in detail, but to date no real hypercycle has been found in a living organism or constructed in the laboratory. Similarly, the notion of autopoiesis, a self-sufficient system based on cellular life as we know it today, including a membrane and a metabolism, has remained mostly at a conceptual level. Chemotons add to autopoiesis the idea of a simple genetic system, and while they are mathematically well understood, they represent a rather complex system that most scientists would not consider a suitable model for the origin of life.

Autocatalytic sets are, in a way, somewhere in between simple hypercycles on the one hand and the more complex autopoiesis and chemotons on the other. As the name implies, the notion of catalysis plays a central role in this model. Catalysts—molecules that speed the rate at which chemical reactions happen, without being used up in the reaction—are essential in determining and regulating the functionality of the chemical networks that give rise to and sustain life.

An autocatalytic set is defined as a collection of chemical reactions and their molecules in which: (1) each reaction in the set is catalyzed by at least one of the molecules from the set, and (2) each molecule in the set can be produced from a basic food source through a series of reactions performed within the autocatalytic set. This definition formally captures the organization of life as a functionally closed and self-sustaining chemical reaction network. But can it also say something about the origin of life?
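The two conditions above can be checked mechanically. The following is a minimal Python sketch under stated assumptions, not the formal framework itself: the names `Reaction`, `closure`, and `is_autocatalytic_set` are hypothetical, molecules are plain strings, food molecules are allowed to act as catalysts, and reactants are stored as sets (so a reaction such as 0 + 0 is recorded with the single reactant 0).

```python
from typing import FrozenSet, Iterable, NamedTuple, Set


class Reaction(NamedTuple):
    """Illustrative record: which molecules go in, come out, and catalyze."""
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def closure(food: Iterable[str], reactions: Iterable[Reaction]) -> Set[str]:
    """Molecules that can be built from the food set using only `reactions`."""
    reactions = list(reactions)
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= reachable and not r.products <= reachable:
                reachable |= r.products
                changed = True
    return reachable


def is_autocatalytic_set(food: Iterable[str], reactions: Iterable[Reaction]) -> bool:
    """Check both conditions of the definition for a candidate set."""
    reactions = list(reactions)
    if not reactions:
        return False
    reachable = closure(food, reactions)
    return all(
        r.reactants <= reachable          # (2) reactants producible from food
        and bool(r.catalysts & reachable)  # (1) a catalyst is available in the set
        for r in reactions
    )


food = {"0", "1"}
make_01 = Reaction(frozenset({"0", "1"}), frozenset({"01"}), frozenset({"10"}))
make_10 = Reaction(frozenset({"0", "1"}), frozenset({"10"}), frozenset({"01"}))

print(is_autocatalytic_set(food, [make_01, make_10]))  # True
print(is_autocatalytic_set(food, [make_01]))           # False
```

In this toy example the two reactions catalyze each other, so together they satisfy the definition, while either one alone does not because its catalyst is never produced.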

Autocatalytic sets in theory and practice

To explain his idea of autocatalytic sets more clearly, Stuart Kauffman developed a simple model of chemical reaction systems, known as the binary polymer model. In this model, molecules are represented by sequences of zeros and ones, or bit strings. Two possible chemical reactions can occur: ligation, the joining of two bit strings together into a longer one, such as 00 + 111 → 00111, and cleavage, or the cutting of a bit string into two shorter ones, such as 010101 → 0101 + 01. Furthermore, each bit string is able to catalyze some of the ligation and cleavage reactions, but these catalytic abilities are assigned randomly. In this way, many different instances of the model can be constructed, where each gives rise to a different reaction network. Kauffman then asked what the probability is that a (random) instance of his model contains a subset of reactions and molecules that together form a self-sustaining autocatalytic set.
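To make the model concrete, here is a small Python sketch of how one random instance might be generated. It is an illustration under assumptions, not Kauffman's original formulation: the function names are hypothetical, each ligation is stored as a triple (left, right, product) whose reverse reading is the corresponding cleavage, and "assigned randomly" is modeled as each molecule catalyzing each reaction independently with a probability `p`.

```python
import itertools
import random


def bit_strings(max_len):
    """All bit strings of length 1..max_len."""
    return [
        "".join(bits)
        for n in range(1, max_len + 1)
        for bits in itertools.product("01", repeat=n)
    ]


def ligations(max_len):
    """Triples (left, right, product) with the product no longer than max_len.

    Read left to right, a triple is a ligation (00 + 111 -> 00111);
    read right to left, it is the matching cleavage.
    """
    mols = bit_strings(max_len)
    return [(a, b, a + b) for a in mols for b in mols if len(a) + len(b) <= max_len]


def assign_catalysts(molecules, reactions, p, rng):
    """Assumed rule: every (molecule, reaction) pair is catalytic with probability p."""
    return {rxn: {m for m in molecules if rng.random() < p} for rxn in reactions}


print(len(bit_strings(3)), len(ligations(3)))  # 14 20

rng = random.Random(1)  # fixed seed so the random instance is reproducible
instance = assign_catalysts(bit_strings(3), ligations(3), p=0.05, rng=rng)
```

Running the generator with different seeds yields different reaction networks, which is what allows the probability question to be studied by sampling.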

Using his binary polymer model, Kauffman constructed a mathematical argument to show that the formation of autocatalytic sets is an expected emergent property of sufficiently complex collections of molecules. In other words, if a chemical reaction network has a large enough diversity of molecules, and if these molecules catalyze a large enough number of reactions, then we can expect there to exist a subset of molecules and reactions that together form a functionally closed and self-sustaining network.1,2

Clearly, this would have important implications for the origin of life. According to Kauffman, in a diverse enough collection of molecules and reactions, life will almost certainly emerge. For example, Kauffman explained his ideas explicitly in terms of autocatalytic sets of proteins.1 We have known since the famous experiments performed by Stanley Miller during the 1950s that various amino acids form spontaneously in a “primordial broth” subjected to electrical sparks, perhaps lightning. Amino acids form peptides and small proteins quite easily, and peptides are known to have catalytic capabilities. So, it is not unthinkable that autocatalytic sets of peptides and small proteins may have formed spontaneously on early Earth.

But these ideas have faced their share of criticism. For example, Kauffman’s mathematical argument may require a level of catalysis that is chemically unrealistic. Many molecules can catalyze more than one reaction. However, with increasing system sizes, Kauffman’s argument may require each molecule to catalyze dozens or even hundreds of reactions.

Moreover, autocatalytic sets have been criticized for lacking the ability to evolve. In Kauffman’s argument, autocatalytic sets show up as “giant connected components,” incorporating almost the entire reaction network. This leaves little room for change and adaptation. Autocatalytic sets, if they do in fact emerge, are thus not able to evolve into more-complex forms, making them biologically uninteresting.

Also, the binary polymer model has been accused of being too simplistic, with abstract bit strings representing real molecules, and even the smallest molecules (such as monomers and dimers) being able to catalyze arbitrary reactions. Because Kauffman’s argument is entirely based on this simple model, it is unclear whether his results generalize to real chemical reaction networks.

But despite these criticisms, examples of real chemical autocatalytic sets do exist. Günter von Kiedrowski, now at the Ruhr-Universität in Bochum, Germany, and his colleagues constructed the first artificial autocatalytic set in 1994, using two short nucleic acid sequences that mutually catalyze each other’s formation from even shorter fragments.3 Gerald Joyce, of the Scripps Research Institute in La Jolla, California, obtained similar results in 2009.5 In this case, pairs of slightly longer nucleic acids were artificially evolved to mutually catalyze each other’s formation with increasing efficiency.

More-complex autocatalytic sets have also been assembled. In 2004, Joyce’s colleague at Scripps, Reza Ghadiri, and other collaborators created an autocatalytic set of nine peptides.6 And in 2012, Niles Lehman of Portland State University built an autocatalytic set of long nucleic acids formed spontaneously from shorter fragments.7 In Lehman’s system, a larger and larger autocatalytic set emerged over time, until eventually it contained all 16 possible molecule types.

These experiments demonstrate the plausibility of autocatalytic sets, but whether or not such self-sustaining systems served as the basis of the origin of life on Earth remains a matter of debate.

The emergence and structure of autocatalytic sets

SELF-SUSTAINING REACTION NETWORKS: The underlying tenet of autocatalytic sets is that a living system is functionally closed and self-sustaining in the presence of a food source. On early Earth, the initial basic ingredients, known as a food set, would have included molecules and elements such as water, hydrogen, nitrogen, carbon dioxide, and iron. These may have spontaneously formed more-complex molecules like the monomers of nucleic acids and proteins. And in more-complex systems—for example, a modern bacterium such as E. coli—the food set becomes even larger. In each case, the food set is all that is needed for the system to run: it provides the basic building blocks for the system’s component parts. The notion of catalysis plays a central role in this model. Catalysts, which speed chemical reactions but are not themselves used up, are critical to the chemical networks that give rise to and sustain life.
Over the past several years, my colleague Mike Steel of the University of Canterbury in New Zealand and I have studied autocatalytic sets in more detail, both mathematically and with computer simulations. This has led to new insights into the emergence, structure, and possible further evolution of autocatalytic sets. Moreover, these new results counter many earlier criticisms.8

We defined Kauffman’s original notion of autocatalytic sets in a mathematically more rigorous way. Then we developed an efficient computer algorithm to detect autocatalytic sets in arbitrary chemical reaction networks. We applied this algorithm to random instances of the binary polymer model and derived various mathematical theorems about autocatalytic sets.
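One simple way to detect an autocatalytic set in an arbitrary network is to prune repeatedly: discard every reaction whose reactants cannot be built from the food set or whose catalysts are all unavailable, then recompute what is reachable and repeat until nothing changes. The Python sketch below illustrates that idea only. It is not presented as the authors' implementation, and `Reaction`, `closure`, and `max_autocatalytic_subset` are hypothetical names.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Reaction(NamedTuple):
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def rxn(reactants, products, catalysts) -> Reaction:
    """Convenience constructor for the examples below."""
    return Reaction(frozenset(reactants), frozenset(products), frozenset(catalysts))


def closure(food: Iterable[str], reactions: Iterable[Reaction]) -> Set[str]:
    """Molecules that can be built from the food set using only `reactions`."""
    reactions = list(reactions)
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= reachable and not r.products <= reachable:
                reachable |= r.products
                changed = True
    return reachable


def max_autocatalytic_subset(food: Iterable[str], reactions: Iterable[Reaction]) -> List[Reaction]:
    """Prune until every remaining reaction is supported; empty list if none is."""
    food = set(food)
    current = list(reactions)
    while True:
        reachable = closure(food, current)
        kept = [r for r in current if r.reactants <= reachable and r.catalysts & reachable]
        if len(kept) == len(current):
            return kept
        current = kept


food = {"0", "1"}
network = [
    rxn({"0", "1"}, {"01"}, {"10"}),
    rxn({"0", "1"}, {"10"}, {"01"}),
    rxn({"01", "1"}, {"011"}, {"111"}),   # catalyst 111 is never produced
    rxn({"011", "0"}, {"0110"}, {"01"}),  # depends on the reaction above
]

core = max_autocatalytic_subset(food, network)
print(len(core))  # 2
```

The third reaction is dropped in the first pass because its catalyst never appears, and the fourth falls in the second pass because its reactant 011 is then no longer producible. Each pass removes at least one reaction, so the loop finishes after at most as many passes as there are reactions.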

After we had run our simulations on a large computer cluster for several weeks, a clear picture emerged. Autocatalytic sets do indeed have a high probability of existence, even for very moderate levels of catalysis. In fact, in instances of the binary polymer model with bit strings up to 20 characters long, each bit string only needs to catalyze one to two reactions, on average, for autocatalytic sets to exist, making this a highly plausible chemical scenario. Additionally, the required level of catalysis increases only very slowly for larger chemical reaction networks, an observation that was subsequently confirmed theoretically. For example, even for bit strings up to 50 characters long, no more than two reactions need to be catalyzed per molecule, on average.

We also looked at the structure of the autocatalytic sets our algorithm identified. Contrary to Kauffman’s original argument that autocatalytic sets emerge as “giant connected components,” it turns out that autocatalytic sets can often be decomposed into smaller subsets, which themselves are autocatalytic. In fact, there often exists an entire hierarchy of smaller and smaller autocatalytic subsets. The smallest autocatalytic sets, which cannot be decomposed any further, are called irreducible autocatalytic sets.
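The decomposition into smaller subsets can be illustrated by shrinking greedily: remove one reaction, recompute the largest autocatalytic subset of what is left, and keep the smaller set whenever it is non-empty. When no single removal leaves anything autocatalytic, the set is irreducible. This Python sketch rests on the same assumptions as a simple pruning detector; all names are hypothetical, and which irreducible set is found depends on the order in which reactions are tried.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Reaction(NamedTuple):
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def rxn(reactants, products, catalysts) -> Reaction:
    return Reaction(frozenset(reactants), frozenset(products), frozenset(catalysts))


def closure(food: Iterable[str], reactions: Iterable[Reaction]) -> Set[str]:
    reactions = list(reactions)
    reachable = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= reachable and not r.products <= reachable:
                reachable |= r.products
                changed = True
    return reachable


def max_autocatalytic_subset(food, reactions) -> List[Reaction]:
    """Prune unsupported reactions until a fixed point is reached."""
    food = set(food)
    current = list(reactions)
    while True:
        reachable = closure(food, current)
        kept = [r for r in current if r.reactants <= reachable and r.catalysts & reachable]
        if len(kept) == len(current):
            return kept
        current = kept


def irreducible_subset(food, reactions) -> List[Reaction]:
    """Greedily shrink an autocatalytic set until no reaction can be removed."""
    current = max_autocatalytic_subset(food, reactions)
    shrunk = True
    while shrunk:
        shrunk = False
        for r in current:
            smaller = max_autocatalytic_subset(food, [q for q in current if q != r])
            if smaller:  # a strictly smaller autocatalytic set survives
                current = smaller
                shrunk = True
                break
    return current


food = {"0", "1"}
self_01 = rxn({"0", "1"}, {"01"}, {"01"})    # 0 + 1 -> 01, catalyzed by 01
grow_011 = rxn({"01", "1"}, {"011"}, {"01"})  # needs 01 as reactant and catalyst
self_00 = rxn({"0"}, {"00"}, {"00"})         # 0 + 0 -> 00, catalyzed by 00

whole = max_autocatalytic_subset(food, [self_01, grow_011, self_00])
smallest = irreducible_subset(food, [self_01, grow_011, self_00])
print(len(whole), len(smallest))  # 3 1
```

Here the full three-reaction network is autocatalytic, yet it contains two single-reaction sets that are autocatalytic on their own, a small instance of the hierarchy described above.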

In recent groundbreaking work, a group of researchers including Kauffman and Hungarian theoretical evolutionary biologist Eörs Szathmáry convincingly showed that autocatalytic sets composed of multiple small, irreducible subsets can, in fact, evolve.9 The main idea is that these autocatalytic subsets can exist in different combinations within a compartment (a protocell), thus giving rise to different types of protocells, and, consequently, to competition and selection. This, combined with our own results that one can indeed expect many such irreducible autocatalytic subsets to exist within a reaction network, suggests that autocatalytic sets are likely to arise from sufficiently complex chemical reaction networks and go on to evolve into larger and more complex systems.

We also studied several variations of the standard binary polymer model, incorporating more-realistic assumptions regarding the system’s chemistry. For example, rather than assigning catalysts completely randomly, we imposed the constraint that a bit string needs to match a certain number of bits around the ligation or cleavage site in a reaction to be able to catalyze that reaction, simulating catalysis by temporary base pair formation. We also considered a version of the model in which molecules are partitioned into two separate sets (such as nucleic acids and amino acids), where reactions can only happen between molecules of the same set, but catalysis can happen both within and between sets. Overall, the main results hold up under all of these assumptions.
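The template-matching constraint can be sketched as a simple predicate. The exact rule used in the published variations is not spelled out here, so the version below is an assumption made for illustration: a catalyst qualifies if it contains, as a contiguous run, the bitwise complement of the `k` bits on each side of the ligation site. The names `complement` and `can_catalyze` are hypothetical.

```python
def complement(bits: str) -> str:
    """Swap 0 and 1, standing in for base pairing."""
    return bits.translate(str.maketrans("01", "10"))


def can_catalyze(catalyst: str, left: str, right: str, k: int) -> bool:
    """Assumed rule: the catalyst must pair with k bits on each side of the site.

    `left` and `right` are the two fragments being joined (or produced by
    cleavage); the site lies between the end of `left` and the start of `right`.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(left) < k or len(right) < k:
        return False  # fragments too short to offer k bits on each side
    site = left[len(left) - k:] + right[:k]
    return complement(site) in catalyst


# Joining 0011 + 0100: the site with k=2 is "11" + "01", whose complement is 0010.
print(can_catalyze("100101", "0011", "0100", k=2))  # True
print(can_catalyze("1111", "0011", "0100", k=2))    # False
```

Under such a rule the shortest molecules can no longer catalyze arbitrary reactions, because a catalyst needs at least 2k bits to cover the site.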

MODELING MULTIPLE MOLECULES: The starting ingredients for life may have included not only RNA nucleotides (purple circles and gray squares) and molecules, but also amino acids (green triangles and pink hexagons) and proteins. In this model, each molecule type can react with like molecules, but RNA molecules cannot react with proteins. Either molecule type can, however, catalyze reactions of the other type.
J.I. SMITH ET AL., “AUTOCATALYTIC SETS IN A PARTITIONED BIOCHEMICAL NETWORK,” JOURNAL OF SYSTEMS CHEMISTRY, 5:2, 2014.

In order to move away entirely from abstract models, we also successfully applied our formal autocatalytic sets framework to real chemical and biological networks, such as the 16-member autocatalytic set that was constructed in the Lehman lab.10 Not only were we able to reproduce most of their experimental results, we also derived new insights and predictions that would have been very difficult to obtain from chemical experiments alone. For example, our simulations show that with each repetition of the experiment, a different pattern of larger and larger autocatalytic sets emerges, revealing the rich hierarchical structure of autocatalytic subsets that exists within this system. Furthermore, the model provides a clear explanation for the experimental observation that a “cooperative” system (autocatalytic set) often outcompetes an equivalent but “selfish” system in which each molecule only catalyzes its own production. It also shows why and how such a cooperative system is more robust against environmental perturbations.

With molecular biologists Bill Martin and Filipa Sousa of the Heinrich-Heine-Universität in Düsseldorf, Germany, we analyzed the entire metabolic network of E. coli with our formal framework.11 Not surprisingly, we found that 98 percent of the reactions in this metabolic network together form an autocatalytic set. The resulting set is quite robust to environmental variations, such as different food sources or the removal of random reactions or molecules. This reflects the redundancy that is known to exist in E. coli’s metabolism, and the bacterium’s ability to live on different food stocks or to produce certain important molecules through multiple alternative chemical pathways. Our results also show a modularity in the network that corresponds well with known functional categories of metabolic reactions. In particular, the computational results underscore the crucial role of cofactors, such as various metals and molecules including ATP, as prime mediators of metabolism. Some of these cofactors may very well have been responsible for generating the very first autocatalytic sets at the origin of life.

In light of all these results, it seems that the main criticisms against the plausibility and evolvability of autocatalytic sets have now been largely resolved. It took more than 40 years, and the efforts of many scientists, to get from the initial ideas to the latest findings. But the gap between theory and experiments is finally closing.

Autocatalytic sets beyond chemistry

As this recent research has shown, autocatalytic sets capture essential aspects of the organization of living organisms, and their high probability of emergence and potential to evolve have important implications for the origin of life. Autocatalytic sets can be studied mathematically and with computer simulations, but they also show up in experimental systems of nucleic acids and in the metabolic network of living organisms. They truly seem to represent a fundamental property of life.

If a living system is indeed an autocatalytic set, then the next question to ask is whether an ecosystem, a network of interdependent organisms, can be considered an autocatalytic superset of autocatalytic subsets. Or, to take it even further, what about social systems such as the economy? Economic production—converting raw materials into consumer goods—can be compared to chemical reactions, with produced goods from hammers to conveyor belts to the Internet serving as the catalysts to facilitate yet other economic productions.

The possibilities are exciting and seemingly endless. And we now have the mathematical and computational tools to study and develop these ideas more formally and extensively. Indeed, several ecologists, economists, and cognitive and social scientists are interested in applying these ideas and tools in their own areas of research. The journey of investigations into autocatalytic sets, and the origin and organization of life itself, continues today as an international and interdisciplinary scientific collaboration.

Wim Hordijk is an independent computer scientist working in the areas of computational biology and origin of life. More information about his collaborative research on autocatalytic sets can be found on the website


  1. S. Kauffman, “Autocatalytic sets of proteins,” J Theor Biol, 119:1-24, 1986.
  2. S. Kauffman, At Home in the Universe. Oxford University Press, 1995.
  3. D. Sievers, G. von Kiedrowski, “Self-replication of complementary nucleotide-based oligomers,” Nature, 369:221-24, 1994.
  4. V. Patzke, G. von Kiedrowski, “Self replicating systems,” ARKIVOC, 5:293-310, 2007.
  5. T.A. Lincoln, G.F. Joyce, “Self-sustained replication of an RNA enzyme,” Science, 323:1229-32, 2009.
  6. G. Ashkenasy et al., “Design of a directed molecular network,” PNAS, 101:10872-77, 2004.
  7. N. Vaidya et al., “Spontaneous network formation among cooperative RNA replicators,” Nature, 491:72-77, 2012.
  8. W. Hordijk, “Autocatalytic sets: From the origin of life to the economy,” BioScience, 63:877-81, 2013.
  9. V. Vasas et al., “Evolution before genes,” Biology Direct, 7:1, 2012.
  10. W. Hordijk, M. Steel, “A formal model of autocatalytic sets emerging in an RNA replicator system,” J Syst Chem, 4:3, 2013.
  11. F.L. Sousa et al., “Autocatalytic sets in E. coli metabolism,” J Syst Chem, 6:4, 2015.