Once upon a time, between 2 million and 4 million years ago, the fruit fly lineage split. One sister species, known today as Drosophila simulans, kept up the parent’s habit of hanging around and consuming ripe fruit. The other, the D. melanogaster beloved by modern geneticists, took a different path. It evolved a more active alcohol dehydrogenase (ADH) enzyme, making it better suited to slurp the high-ethanol content of perfectly rotten fruit.

It’s a nice story, one that evolutionary biologists touted for decades as an illustrative example of molecular adaptation. Too bad it’s wrong.

Last year, researchers at the University of Chicago refuted that just-so scenario by resurrecting the ancient ADH enzyme from the last common ancestor of D. simulans and D. melanogaster. They replaced a modern D. melanogaster’s ADH with the prehistoric version of the protein, from before flies colonized rotten fruit, and it made no difference...

The scientists conducting that experiment used a computational technique called ancestral sequence reconstruction (ASR) to determine the amino acids that made up the ancient protein, thus allowing them to revive it. Researchers have been adopting the method over the past decade. “Using ancestral reconstruction, we can test those historical hypotheses directly,” says Mo Siddiq, a graduate student who led the Drosophila project in the laboratory of Joe Thornton. The team was the first to resurrect an ancient protein in a living animal. (With ADH ruled out, Siddiq now suspects that “a suite of changes” brought D. melanogaster to its boozy diet.)

At its heart, ASR addresses “the ‘why’ of proteins,” says Michael Harms, an evolutionary biologist at the University of Oregon. Why did proteins evolve the way they did? Was it through chance, or did some other factors push them in a particular direction? Some biologists also think resurrected proteins offer clues about the past—what the world was like when certain proteins first appeared on the evolutionary landscape, perhaps even when life first evolved some 3.8 billion years ago.

Other researchers are looking to ancient proteins for ideas about the future, taking inspiration from the ancient molecules’ forms and functions as they engineer enzymes with new, targeted purposes for industrial, agricultural, or medical use. Ancestral sequences often yield proteins that are highly stable and flexible in function. “They can make a very good starting point for bioengineering,” says José Sánchez-Ruiz, a physical chemist studying biomolecules at the University of Granada in Spain.

How to Resurrect a Protein

Ancestral sequence reconstruction relies on phylogeny and statistics to infer the most likely amino acid sequence for an ancient protein.
SEQUENCE ALIGNMENT: Scientists collect sequences from databanks of the modern versions of the protein of interest from different organisms.
TREE BUILDING: Computer algorithms construct a phylogenetic tree for the proteins (Curr Opin Struct Biol, 38:37–43, 2016).

ANCESTRAL RECONSTRUCTION: The programs can then infer the sequences that likely existed at nodes of the tree, before the modern species evolved.
LABORATORY TESTS: Finally, the scientists order synthetic DNA and generate those proteins in the lab to use for experiments.
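To make the reconstruction step concrete, here is a minimal sketch in Python. It is not the method any of the labs in this story used: real ASR programs typically rely on statistical models of sequence evolution, whereas this toy swaps in Fitch parsimony, a much simpler rule, and infers only the root of the tree. The species names, sequences, and tree shape are all invented for illustration.

```python
# Toy alignment: four made-up "modern" sequences of equal length.
ALIGNMENT = {
    "sp_A": "MKTAY",
    "sp_B": "MKTAF",
    "sp_C": "MRTAF",
    "sp_D": "MRSAF",
}

# Toy phylogeny as nested pairs: (A, B) and (C, D) are sister groups.
TREE = (("sp_A", "sp_B"), ("sp_C", "sp_D"))


def fitch_states(node, column):
    """Bottom-up Fitch pass: residues at `node` that minimize substitutions."""
    if isinstance(node, str):          # a leaf carries its observed residue
        return {column[node]}
    left, right = (fitch_states(child, column) for child in node)
    shared = left & right
    # Keep residues both children agree on; otherwise keep all candidates.
    return shared if shared else left | right


def reconstruct_root(tree, alignment):
    """Infer the root sequence column by column; 'X' marks ambiguous sites."""
    length = len(next(iter(alignment.values())))
    ancestor = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        states = fitch_states(tree, column)
        ancestor.append(min(states) if len(states) == 1 else "X")
    return "".join(ancestor)


print(reconstruct_root(TREE, ALIGNMENT))  # MXTAF
```

The second position comes back as "X" because the two halves of the tree disagree (K versus R) and parsimony alone cannot choose between them. A probabilistic method would instead assign each candidate a likelihood, which is what lets researchers rank best and second-best guesses.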

Ancient proteins in modern organisms

University of Arizona evolutionary and synthetic biologist Betül Kacar is interested in basic questions about life’s origins. So she’s doing a version of what the late Stephen Jay Gould, a prominent evolutionary biologist and science communicator, referred to in his book Wonderful Life as “replaying life’s tape.”

Gould wondered: If you could rewind evolution to a given point, and let it proceed for a second time, would life evolve in a different manner, or arrive at the same endpoint? In other words, is the direction of evolution determined by chance, or is there a most likely sequence of events? It’s a question that intrigues not only evolutionary biologists gazing into the past, but also astrobiologists wondering how the process might unfold on other planets.

With funding from NASA, Kacar set out to replay the tape for one particular protein, EF-Tu. Found in all organisms, it delivers tRNAs with their amino acid cargoes to the ribosome. Several ancestral EF-Tu constructs had been developed in the laboratory of Kacar’s collaborator and former postdoc advisor, Eric Gaucher of the NASA Astrobiology Institute team at Georgia Institute of Technology in Atlanta. To go back in time to the ancient sequences, Gaucher, then with NASA’s team at the University of Florida, had started with amino acid sequences of EF-Tu from 50 modern bacteria. From there, he used computer algorithms to construct a phylogenetic tree for those proteins, and checked that it matched well with other phylogenies for the organisms involved. The ASR algorithms then used the family tree to infer the most likely amino acid sequences at the nodes of the tree, ranging from 3.5 billion to 500 million years ago.2,3

Kacar rewound 700 million years, reconstructing an EF-Tu likely to have been encoded by a proteobacterium. Then, like Siddiq, she set about creating a modern bacterium that used the extinct form of the enzyme. After a couple of years’ worth of cloning attempts, and thanks to “a good healthy level of obsession,” Kacar says, she managed to create an E. coli strain that did so. This forced the bacteria to work with an EF-Tu that only poorly interfaced with all the modern partner proteins it needed to bind. They weren’t the fittest bacteria, taking about 45 minutes to double their numbers, compared with 20 minutes for thoroughly modern E. coli, but they were viable.

Kacar divided those bacteria into six culture flasks, and grew them for about 2,000 generations. Over time, their fitness increased and they attained doubling times of about 25 minutes. In results published last year, Kacar reported that five cultures played the same tune: they achieved this fitness level by boosting expression of the old gene for EF-Tu, which they did with the help of a mutated promoter.4 That route contrasts with the one taken by the modern EF-Tu, which, thanks to 700 million years of evolution, differs from the ancient version by 21 amino acids. Those changes allow it to interface properly with the rest of the proteins in modern E. coli, thus achieving quick cell division rates. Although the resurrected EF-Tu was far from optimal for modern E. coli, providing enough suboptimal EF-Tu seemed to give the protein-making machinery what it needed to grow at a reasonable rate. The researchers confirmed that artificially overexpressing the ancient gene for EF-Tu also improved bacterial fitness.

Bacteria in the sixth flask accumulated several mutations throughout their genomes, and Kacar suspects some of those altered the gene network that controls the EF-Tu gene’s transcription, to increase it and improve fitness. So perhaps evolution often takes the most likely route—at least with this one protein.

Most researchers have only studied ASR-derived proteins in vitro; analyzing the ancient proteins in the context of a modern organism is a relatively new application. But the in vivo approach is quickly gaining steam. In addition to Kacar’s and Siddiq’s experiments, for example, Sánchez-Ruiz and colleagues recently resurrected an ancient gene for thioredoxin and inserted it into modern E. coli. Thioredoxin is a redox protein that donates electrons to diverse enzymes, but it also serves as a key factor that allows bacteriophages to propagate in host microbes. When the researchers replaced the modern thioredoxin with versions from 2 million to 4 million years ago, the E. coli were unable to support viral propagation.5 It’s not that the old enzyme was better or worse—it’s simply that it’s not compatible with the modern virus. That mismatch protects the bacteria.

Thus, in just a couple of decades, rewinding evolution using ASR proteins has shed light not only on how the macromolecules changed through the eons, but also how they function in vivo.

Ancient proteins interrogate origins of life

Some scientists also believe ASR offers insights into the habitat preferences of ancient life forms. Sánchez-Ruiz’s ancient thioredoxin, as well as Gaucher’s ancestral EF-Tu, fit into a common trend among ASR proteins from more than 2 billion years ago: they seem to have liked it hot, about 30 °C to 40 °C hotter than proteins from the last 1 billion to 2 billion years, if the difference between modern and ancient melting points is any indication. The trend has led researchers to speculate that early life lived, and perhaps originated, in hot springs, or in the sweltering oceans geochemists believe existed early in Earth’s timeline, up until around 3.2 billion years ago or so. Or, it may be that life evolved at a variety of temperatures, but only the most thermostable early organisms survived asteroid bombardment that could have boiled the oceans during the first 700 million-plus years of the planet’s history.

Dan Tawfik, an evolutionary biochemist at the Weizmann Institute of Science in Israel, says he was initially quite enthusiastic about the idea that heat-stable ASR proteins indicate a hot origin for life. But doubts crept in. He spoke with Weizmann geochemist Itay Halevy and discovered that the hot-ocean hypothesis isn’t universally accepted. Halevy says that the sun was 25 percent dimmer 3.5 billion years ago, and the geological record indicates glaciation during Earth’s early history, which was unlikely to happen if the ocean’s average temperature was 70 °C or higher. And in his recent analyses of resurrected liver proteins from the common ancestor of mammals, Tawfik found additional reasons to doubt the idea.

Tawfik and colleagues reconstructed a mammalian serum paraoxonase (PON)—a protein that hydrolyzes molecules called lactones and organophosphates, detoxifying them—dating to 65 million to 100 million years ago.7 Its melting temperature was 13 °C higher than that of a modern PON.8 Yet the Earth’s temperature back then resembled today’s. And since mammals can regulate their own temperatures, it would be unnecessary to match their enzyme’s ideal temperatures to the environment. So why would Tawfik’s PON, plucked from the relatively recent past, exhibit such high thermostability?

One possible explanation could be that the ancient proteins required extra stability for other reasons. For example, Tawfik speculates that perhaps protein translation wasn’t all that accurate in the past. If the proteins had hyperstable amino acid sequences, he posits, they’d be better suited to cope with a high rate of translational errors. In that case, thermostability would have evolved as a side effect.

Or, the thermostability may be an artifact of the reconstruction process. After all, ASR is based on a computer program’s best guess at an ancestral protein. Some researchers suspect that the ASR sequences, which are inferred based on the family tree of a given protein, are little different from the consensus sequences that are obtained by averaging extant strings of amino acids, without taking evolution into account. And those consensus sequences also tend to be thermostable, says Harms, though he’s not sure why that happens. “It’s an approximate method,” says Harms. “I don’t think we’ve run all the controls that we would need, as a field, to conclude that this isn’t an artifact.”
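For comparison, a consensus sequence ignores the family tree entirely. The sketch below, in Python, simply takes the most common residue in each column of an alignment; the four sequences are invented, and breaking ties alphabetically is an arbitrary choice made here so the output is reproducible, not a convention drawn from the researchers quoted.

```python
from collections import Counter


def consensus(aligned):
    """Return the most common residue per column; ties go to the earlier letter."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):       # walk the alignment column by column
        counts = Counter(column)
        top = max(counts.values())
        result.append(min(res for res, n in counts.items() if n == top))
    return "".join(result)


# Four made-up extant sequences, already aligned.
EXTANT = ["MKTAY", "MKTAF", "MRTAF", "MRSAF"]
print(consensus(EXTANT))  # MKTAF
```

Because no phylogeny is involved, every sequence gets an equal vote, so a heavily sampled branch of the family can dominate the result. That is the sense in which consensus averaging does not take evolution into account.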

Others are more confident of ASR-based conclusions. Consensus and ancestral sequences often do resemble each other, but that doesn’t make the ASR-based deductions wrong, says Sánchez-Ruiz. And consensus doesn’t always mean stability. For example, Sánchez-Ruiz used modern bacterial sequences to generate ASR beta-lactamases, enzymes that degrade antibiotics, and they were quite thermostable. But when he generated a trio of consensus variants, only one was particularly stable.10 “Consensus proteins sometimes show enhanced stability, likely because they capture some of the ancestral stability,” Sánchez-Ruiz says.

Ensuring Accuracy

One way to ensure that an ASR protein behaves like the true ancestor is to resurrect and test not only the best amino acid sequence generated by the algorithms, but a few proteins with the second-best guesses, or third-best guesses, and so on. If those alternative ancestors act like the best-guess version, then researchers figure the conclusions are probably robust. Recently, evolutionary synthetic biologist Eric Gaucher of Georgia State University tested ASR accuracy in a different way. He generated an entirely artificial phylogenetic tree, starting with red fluorescent protein and randomly mutating it to evolve 19 diversely colored fluorescent proteins. Then he used ASR to predict the ancestors of those 19 descendants, and compared the results to the true ancestors. The results were reassuring. Overall, the five different ASR algorithms he tried identified the ancestral sequence with about 97 percent accuracy (Nat Commun, 7:12847, 2016).
EVOLVING PROTEINS: The experimental evolution began with a red fluorescent protein gene (left). The 19 resulting proteins were sequenced, and the data were used to infer the sequences of the node proteins. (Colors represent protein fluorescence. The number of nonsynonymous and synonymous substitutions are shown along each branch.)
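One simple way to score a reconstruction against a known ancestor is percent identity: the share of positions where the inferred and true sequences match. The Python sketch below illustrates that idea only. It is not necessarily the metric used in the study, and both sequences are invented.

```python
def percent_identity(inferred, true):
    """Percentage of aligned positions at which two sequences agree."""
    if len(inferred) != len(true) or not true:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(inferred, true))
    return 100.0 * matches / len(true)


# Made-up example: the inferred ancestor differs from the true one at one site.
TRUE_ANCESTOR = "MKTAFLGSVW"
INFERRED = "MKTAFLGSIW"
print(percent_identity(INFERRED, TRUE_ANCESTOR))  # 90.0
```

An experimental phylogeny makes this comparison possible because the true ancestors were recorded along the way. For natural proteins there is no such answer key, which is why researchers fall back on testing alternative reconstructions.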

Ancient proteins help bioengineers design new ones

No matter how or why ASR proteins are so thermostable, the trait is good news for bioengineers aiming to create new, application-ready proteins, says Tawfik. “Who cares why it works?” he says. “If we want to engineer an enzyme, we hardly ever start with an E. coli or human enzyme, we typically infer the ancestor and use this as a starting point.”

Plus, ASR proteins are often promiscuous, binding to or acting on a greater array of partner molecules than their modern counterparts. Sánchez-Ruiz observed this when he analyzed the ancient beta-lactamases. While a modern beta-lactamase specializes in disarming penicillin, the enzymes from 2 billion or 3 billion years ago degrade a broader spectrum of antibiotics.11 “The modern protein is the outcome of 4 billion years of evolution, and it is highly specialized,” he says.

Harms and other researchers have argued that the evidence for a consistent trend, from protein generalists to specialists, has yet to be established.9 “I think that ancestral proteins were just as optimal as modern proteins for their environment,” Harms says.

But once again, whether ASRs accurately reflect proteins’ histories or not, the fact that they’re often generalists is a boon for bioengineers. “For a new function to evolve, it has to be there to start with, as something latent,” explains Tawfik. ASR proteins may serve as jacks-of-all-trades, giving protein engineers a better starting point.

To test that idea, Sánchez-Ruiz, Gaucher, and collaborators are planning a high-throughput experiment. Starting with an ancient enzyme, they hope to nudge its evolution toward desired functions. One test case they’re aiming at is the Diels-Alder reaction, a chemical process for building molecular rings. No enzyme that can accomplish this task has ever evolved in nature, though bioengineers have managed to design synthetic proteins that can. If the collaborators can evolve a Diels-Alder enzyme from an ancient precursor in the lab, it will be a good indicator that they can do so for other types of reactions.

One application for such reconstructed enzymes might be in agriculture. After using the ancient version of thioredoxin to engineer disease-resistant E. coli, Sánchez-Ruiz’s group suggested that a similar approach in plants could protect crops from pathogens.5 “I hope someone with expertise in plant bioengineering gives it a shot,” he says.

The Perfect Starting Point

Bioengineers love resurrected proteins because they often combine two desirable features: thermostability and promiscuity. For example, researchers at the University of Granada in Spain reconstructed several versions of an antibiotic-resistance protein called beta-lactamase, going back as far as 3 billion years. As the protein evolved, its melting point dropped from more than 80 °C to less than 60 °C. It also became more specific for penicillin, losing its ability to neutralize other drugs (J Am Chem Soc, 135:2899–902, 2013).

Medical uses for reconstructed proteins

ASR can also offer hints for bioengineers with medical applications in mind. That’s what happened with Gaucher’s investigations of uricase, a useful enzyme that was deactivated at some point in human evolution. While most animals depend on uricase to break down uric acid, apes possess only a dysfunctional pseudogene, which evolved from a functional precursor. Without uricase, uric acid can build up and crystallize in joints, where it causes pain and swelling known as gout. “Something happened when apes evolved,” says Gaucher.

To figure out how our uricase lost its mojo, Gaucher resurrected uricases from several points in the past. The prehistoric enzymes showed a stepwise progression from high activity to low. By the last common ancestor of all apes, which likely lived 20 million years ago, uricase’s activity was undetectable.12 At the same time, a uric acid transporter called URAT1 also evolved to increase the blood concentration of uric acid.13

What pushed the ancestors of apes to maintain high uric acid, despite the attendant risks? Gaucher and his collaborators suspect it relates to our taste for fruit. High levels of uric acid promote the conversion of fruit sugars into fat. Around the time uric acid metabolism was changing, the climate cooled, causing fewer flowering trees to produce fruit. That meant frugivore primates needed to gorge on fruit whenever they found it, saving energy for a fruitless day. Dropping uricase would have allowed them to store more energy,14 and might have supported the development of bigger brains, Gaucher suggests. “This is undoubtedly one of the drivers that allowed apes to evolve.”

Now, the growing understanding of uricase’s antecedents, and the reconstructed versions of them, could offer a potential treatment for gout sufferers. There’s already a version of uricase on the market, a recombinant-bacteria-made chimeric protein assembled from pig and baboon sequences. However, when Gaucher and colleagues tested their ASR uricase in rats, its half-life in blood was nearly 100 times longer than that of the hybrid version.12

Gaucher and collaborators have also used ASR to improve the stability and activity of Factor VIII, a coagulant used to treat hemophilia, and they believe the method could apply to any protein-based therapeutic.15 Gaucher founded a company, General Genomics, to develop uricase and other ancient proteins into medications, and for industrial or agricultural use. “It is quite cool that we are able to push the boundaries of biology a little bit, and engineer systems with ancestral genes,” says Kacar. “Now the question is, ‘How far can you push this?’”


  1. M.A. Siddiq et al., “Experimental test and refutation of a classic case of molecular adaptation in Drosophila melanogaster,” Nat Ecol Evol, 1:0025, 2017.
  2. E.A. Gaucher et al., “Inferring the palaeoenvironment of ancient bacteria on the basis of resurrected proteins,” Nature, 425:285–88, 2003.
  3. E.A. Gaucher et al., “Palaeotemperature trend for Precambrian life inferred from resurrected proteins,” Nature, 451:704–7, 2008.
  4. B. Kacar et al., “Experimental evolution of Escherichia coli harboring an ancient translation protein,” J Mol Evol, 84:69–84, 2017.
  5. A. Delgado et al., “Using resurrected ancestral proviral proteins to engineer virus resistance,” Cell Rep, 19:1247–56, 2017.
  6. V.A. Risso et al., “Thermostable and promiscuous Precambrian proteins,” Environ Microbiol, 16:1485–89, 2014.
  7. H. Bar-Rogovsky et al., “The evolutionary origins of detoxifying enzymes: The mammalian serum paraoxonases (PONs) relate to bacterial homoserine lactonases,” J Biol Chem, 288:23914–27, 2013.
  8. D.L. Trudeau et al., “On the potential origins of the high stability of reconstructed ancestral proteins,” Mol Biol Evol, 33:2633–41, 2016.
  9. L.C. Wheeler et al., “The thermostability and specificity of ancient proteins,” Curr Opin Struct Biol, 38:37–43, 2016.
  10. V.A. Risso et al., “Phenotypic comparisons of consensus variants versus laboratory resurrections of Precambrian proteins,” Proteins, 82:887–96, 2014.
  11. V.A. Risso et al., “Hyperstability and substrate promiscuity in laboratory resurrections of Precambrian beta-lactamases,” J Am Chem Soc, 135:2899–902, 2013.
  12. J.T. Kratzer et al., “Evolutionary history and metabolic insights of ancient mammalian uricases,” PNAS, 111:3763–68, 2014.
  13. P.K. Tan et al., “Coevolution of URAT1 and uricase during primate evolution: Implications for serum urate homeostasis and gout,” Mol Biol Evol, 33:2193–200, 2016.
  14. R.J. Johnson, P. Andrews, “Fructose, uricase, and the back-to-Africa hypothesis,” Evol Anthropol, 19:250–57, 2010.
  15. P.M. Zakas et al., “Enhancing the pharmaceutical properties of protein drugs by ancestral sequence reconstruction,” Nat Biotechnol, 35:35–37, 2017.
