Ancient Life in the Information Age

What can bioinformatics and systems biology tell us about the ancestor of all living things?

Mar 1, 2014
Aaron David Goldman

GETTING TO THE ROOT OF THINGS: Charles Darwin drew this famous tree on page 36 in his Notebook B (1837-38) to illustrate his ideas about the relationship between living (branches with perpendicular tips) and extinct (branches with no tip) organisms that descended from a common ancestor (circled 1 at base of tree). © CAMBRIDGE UNIVERSITY LIBRARY

All known organisms share a number of fundamental features that, taken together, point to a common evolutionary history: DNA as the chief molecule of genetic inheritance, proteins as the primary functional molecules, and RNA as an informational intermediate between the two. The simplest explanation for why organisms share these common features is that they are inherited from a last universal common ancestor (LUCA), which sits at the root of the tree of life. Most studies of gene duplications that occurred prior to the first branch on the tree place LUCA in between the Bacteria and the common ancestor of the Archaea and Eukarya, the three taxonomic domains of cellular life.

The availability of the genome sequences from so many species across the tree of life has made it possible to look for common genomic traits that were most likely inherited from LUCA. The methods employed to identify these common genomic traits can vary greatly, however, and as a result lead to very different predictions. Some studies have estimated there to be fewer than 100 LUCA-derived gene families, while others count more than 1,000, depending on how conservatively the methods rule out genes on suspicion of horizontal gene transfer or how liberally they include genes that appear to have been present in LUCA, but subsequently lost. Despite the conflicting results, the new data are yielding insight into ancient life on Earth.  
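To see how the choice of criteria alone can change the count, consider a minimal sketch in Python. It is not the method of any particular study: the family names, the presence data, and the horizontal-transfer flags are all hypothetical. The strict rule keeps only families found in all three domains and not flagged for horizontal transfer; the liberal rule keeps any family found on both sides of the root (in Bacteria and in at least one of Archaea or Eukarya), treating absences as later losses and ignoring the flags.

```python
from typing import Dict, Set

# The three domains of cellular life named in the article.
BACTERIA, ARCHAEA, EUKARYA = "Bacteria", "Archaea", "Eukarya"


def luca_candidates(
    presence: Dict[str, Set[str]],
    hgt_suspects: Set[str],
    strict: bool = True,
) -> Set[str]:
    """Return gene families that pass a strict or liberal LUCA filter.

    Both filters are illustrative assumptions, not published criteria.
    """
    candidates = set()
    for family, domains in presence.items():
        if strict:
            # Strict: present in all three domains and not suspected of
            # having spread by horizontal gene transfer.
            in_all = {BACTERIA, ARCHAEA, EUKARYA} <= domains
            if in_all and family not in hgt_suspects:
                candidates.add(family)
        else:
            # Liberal: present on both sides of the root, so an absence
            # in one domain is read as a loss after LUCA.
            spans_root = BACTERIA in domains and bool(domains & {ARCHAEA, EUKARYA})
            if spans_root:
                candidates.add(family)
    return candidates


# Hypothetical toy data, not real gene families.
presence = {
    "family_A": {BACTERIA, ARCHAEA, EUKARYA},
    "family_B": {BACTERIA, ARCHAEA, EUKARYA},
    "family_C": {BACTERIA, ARCHAEA},
    "family_D": {ARCHAEA, EUKARYA},
    "family_E": {BACTERIA},
}
hgt_suspects = {"family_B"}

strict_set = luca_candidates(presence, hgt_suspects, strict=True)
liberal_set = luca_candidates(presence, hgt_suspects, strict=False)
print(sorted(strict_set))   # ['family_A']
print(sorted(liberal_set))  # ['family_A', 'family_B', 'family_C']
```

Even with five made-up families, the strict filter returns one candidate and the liberal filter three, the same kind of gap that separates the published estimates.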

The majority of ancient gene families identified in almost all of these studies are involved in the translation of genetic information into proteins. These ancient gene families represent a range of translation functions, from regulation to ribosomal components. The genetic code at the core of translation is also highly conserved across life. In all likelihood, the enzymes responsible for establishing the genetic code by attaching amino acids to particular tRNAs evolved prior to the time of LUCA, although their evolutionary histories are obscured by subsequent horizontal gene transfers between bacteria and archaea. These results depict a translation system in LUCA that was probably similar to and as sophisticated as those of organisms alive today.

In contrast, few genes involved in the synthesis of DNA are conserved across the tree of life. The enzymes responsible for making deoxyribonucleotides from ribonucleotides exist in three distinct families that only show a weak signature of common descent in their active sites. The only DNA polymerase enzymes that are common across the evolutionary tree are those involved in repair, not the polymerases presently responsible for copying complete chromosomes. RNA polymerases from bacteria, archaea, and eukaryotes, on the other hand, do appear to have been inherited from LUCA, and may have previously functioned as DNA polymerases as well. Taken together, these observations suggest that DNA genomes replaced a genome composed of RNA just prior to or perhaps just after the time of LUCA.

The variety of metabolic strategies observed in modern organisms demonstrates that metabolism is generally less highly conserved, which makes it harder to identify those metabolic pathways that were present in LUCA. Still, various databases organize enzymatic data into metabolic maps, which can be used to uncover highly conserved components of modern metabolic pathways. For example, a recent study combined these data with evolutionary trees of carbon-fixation genes and found that the ancestral carbon-fixation pathway was most likely an amalgam of components currently found in two separate pathways in extant archaea and bacteria: the reductive acetyl-CoA pathway and the reductive citric acid cycle (PLOS Comput Biol, 8:e1002455, 2012). Another taxonomically broad comparison study, focused on amino acid metabolism, uncovered conserved biosynthetic pathways for 8 of the 20 canonical amino acids, and conserved enzymes from pathways for another eight (Genome Biology, 9:R95, 2008).
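The distinction between a conserved pathway and a pathway with only some conserved enzymes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the studies cited: the pathway names, enzyme names, and domain assignments are hypothetical, and "conserved" is simplified to mean present in all three domains.

```python
from typing import Dict, List, Set

ALL_DOMAINS = frozenset({"Bacteria", "Archaea", "Eukarya"})


def classify_pathways(
    pathways: Dict[str, List[str]],
    enzyme_domains: Dict[str, Set[str]],
) -> Dict[str, str]:
    """Label each pathway by how many of its enzymes are universal.

    An enzyme counts as universal here if it is found in all three
    domains; this threshold is an assumption made for illustration.
    """
    labels = {}
    for name, enzymes in pathways.items():
        universal = [
            e for e in enzymes
            if enzyme_domains.get(e, set()) >= ALL_DOMAINS
        ]
        if enzymes and len(universal) == len(enzymes):
            labels[name] = "conserved pathway"
        elif universal:
            labels[name] = "conserved enzymes only"
        else:
            labels[name] = "not conserved"
    return labels


# Hypothetical toy data, not taken from any metabolic database.
pathways = {
    "pathway_X": ["e1", "e2", "e3"],
    "pathway_Y": ["e4", "e5"],
    "pathway_Z": ["e6"],
}
enzyme_domains = {
    "e1": {"Bacteria", "Archaea", "Eukarya"},
    "e2": {"Bacteria", "Archaea", "Eukarya"},
    "e3": {"Bacteria", "Archaea", "Eukarya"},
    "e4": {"Bacteria", "Archaea", "Eukarya"},
    "e5": {"Bacteria"},
    "e6": {"Archaea"},
}

labels = classify_pathways(pathways, enzyme_domains)
for name in sorted(labels):
    print(name, "->", labels[name])
```

A real analysis would also weigh evolutionary trees for each enzyme, as the carbon-fixation study did, rather than rely on presence and absence alone.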

Finally, LUCA most likely had a phospholipid membrane that set the boundaries between organisms and offered protection from the external environment. The universal presence of genes responsible for targeting proteins to membranes suggests that LUCA’s membrane was replete with proteins. Furthermore, the ubiquity of both catalytic subunits of the membrane-bound ATPase motor also implies that this membrane was impermeable enough to ions that it could be used to generate the proton gradients used by the motor to convert ADP to ATP.

While this detailed understanding of LUCA is relatively recent, Darwin proposed the idea of an early common ancestor to all life in the first edition of Origin of Species, where he wrote, “Therefore I should infer from analogy that probably all the organic beings which have ever lived on this earth have descended from some one primordial form, into which life was first breathed.” Although Darwin’s insight was brilliant for its time, the modern view shows that LUCA is not this “primordial form,” but rather a sophisticated cellular organism that, if alive today, would probably be difficult to distinguish from extant bacteria or archaea. This means that a great deal of evolution must have taken place between the time of the origin of life and the appearance of LUCA. Continuing advances in evolutionary biology, bioinformatics, and computational biology will give us the tools to describe LUCA and the evolutionary transitions preceding it with unprecedented accuracy and detail.

Aaron David Goldman is an assistant professor of biology at Oberlin College. His research employs bioinformatics and systems biology tools to study the genome and metabolism of LUCA and their connections to evolutionary predecessors.
 

Suggested Reading

A. Becerra et al., "The very early stages of biological evolution and the nature of the last common ancestor of the three major cell domains," Annu Rev Ecol Evol Syst, 38:361-79, 2007.

R. Braakman, E. Smith, "The emergence and early evolution of biological carbon-fixation," PLOS Comput Biol, 8:e1002455, 2012.

P. Forterre, "The origin of DNA genomes and DNA replication proteins," Curr Opin Microbiol, 5:525-32, 2002.

S.J. Freeland et al., "Do proteins predate DNA?" Science, 286:690-92, 1999.

A.D. Goldman et al., "LUCApedia: a database for the study of ancient life," Nucleic Acids Res, 41:D1079-82, 2013.

A.D. Goldman, L.F. Landweber, "Oxytricha as a modern analog of ancient genome evolution," Trends Genet, 28:382-88, 2012.

A.D. Goldman et al., "The evolution and functional repertoire of translation proteins following the origin of life," Biol Direct, 5:15, 2010.

R.D. Knight et al., "Rewiring the keyboard: evolvability of the genetic code," Nat Rev Genet, 2:49-58, 2001.

J.M. Kollman, R.F. Doolittle, "Determining the relative rates of change for prokaryotic and eukaryotic proteins with anciently duplicated paralogs," J Mol Evol, 51:173-81, 2000.

D. Theobald, "A formal test of the theory of universal common ancestry," Nature, 465:219-22, 2010.

C. Woese, "The universal ancestor," PNAS, 95:6854-59, 1998.