ABOVE: Angelo Andrianiaina, a Malagasy graduate student in Brook’s lab, holds a Pteropus rufus bat captured in central Madagascar and sampled for metagenomic sequencing. THERESA LAVERTY/BROOK LAB

On December 18, 2019, Wuhan Central Hospital admitted a patient with symptoms common for the winter flu season: a 65-year-old man with fever and pneumonia. Ai Fen, director of the emergency department, oversaw a typical treatment plan, including antibiotics and anti-influenza drugs.

Six days later, the patient was still sick, and Ai was puzzled, according to news reports and a detailed reconstruction of this period by evolutionary biologist Michael Worobey. The respiratory department decided to try to identify the guilty pathogen by reading its genetic code, a process called sequencing. They rinsed part of the patient’s lungs with saline, collected the liquid, and sent the sample to a biotech company. On December 27, the hospital got the results: The man had contracted a new coronavirus closely related to the one that caused the SARS outbreak that began 17 years before.

The original SARS virus was sequenced five months after the first cases were recorded. This type of traditional sequencing reads the full genetic code, or genome, of just one organism at a time, which first needs to be carefully isolated from a sample. The researchers hired by Wuhan Central Hospital were able to map the new virus so quickly using a more demanding technique called metagenomic sequencing, which reads the genomes of every organism in a sample at once—without such time-intensive preparation. If the traditional approach is like locating a single book on a shelf and copying it, metagenomic sequencing is like grabbing all of the books off the shelf and scanning them all at once.

This ability to quickly read a range of genomes has proven useful in fields from ecology to cancer treatment. And the COVID-19 pandemic has pushed some researchers to use metagenomics to try to spot new diseases and respond to them earlier—before they become epidemics, and potentially before they even infect people. Some of these experts say the early spread of COVID-19 in the United States could have been curbed more quickly if the medical community had applied this technology.

“If metagenomic sequencing was done more routinely, maybe we would’ve known what it was when there were only 20 infections” in the U.S., said Joe DeRisi, a professor of biochemistry and biophysics at the University of California, San Francisco and president of the Chan Zuckerberg Biohub, a nonprofit research center.

But while the raw power of metagenomics is clear, there are challenges to using it to squelch potential pandemics. The technique requires intensive computer processing, making it costlier than some others, and calls for greater expertise to interpret the results. Using the copious data metagenomics produces to guide treatment also raises quandaries about medical decision-making when, for instance, it’s not clear whether a certain pathogen is causing a certain illness.

Still, advocates say the costs are worth it. “Metagenomics plays a critical role in pandemic preparedness, by looking for the things we don’t know to look for,” said Jessica Manning, an infectious disease researcher at the National Institute of Allergy and Infectious Diseases.


The rise of metagenomics over the past couple of decades is due in part to advances in genome sequencing. To read the contents of the genome, researchers first isolate the molecules that store genetic information, DNA and RNA, which are long chains of nucleotides, the letters of the genetic library. Then they cut the long molecules into shorter chunks and read the order of letters in each chunk. Finally, they combine the shorter “reads” to reconstruct the full genome.
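The cut, read, and recombine steps described above can be illustrated with a toy sketch. It is not the method used by any sequencing lab or company mentioned in this article: the function names (`fragment`, `overlap`, `greedy_assemble`), the made-up 21-letter "genome," and the simple greedy merging strategy are all assumptions chosen for illustration. Real assemblers must also cope with sequencing errors, repeated regions, and millions of reads.

```python
def fragment(genome, read_len, step):
    """Cut a genome string into overlapping reads (hypothetical helper)."""
    starts = list(range(0, len(genome) - read_len + 1, step))
    last = len(genome) - read_len
    if starts[-1] != last:
        starts.append(last)  # make sure the end of the genome is covered
    return [genome[s:s + read_len] for s in starts]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the two reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps; return separate pieces
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads)
                 if k not in (best_i, best_j)] + [merged]
    return reads


# Made-up sequence standing in for a genome (assumption, not real data).
genome = "ATGGCGTACGTTAGCCTAGGA"
reads = fragment(genome, read_len=10, step=5)
assembled = greedy_assemble(reads)
print(reads)
print(assembled)
```

In this sketch the four short reads overlap enough that the greedy merge recovers the original string. With too little overlap between reads, the function returns several disconnected pieces instead of one sequence.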

Over the past 40 years, innovation, especially automation, dramatically improved every part of this process. The Human Genome Project, launched in 1990, took more than a decade of work coordinated between 20 research groups and cost around a billion dollars. Today a human genome can be sequenced more accurately, for less than one-millionth the cost, by one scientist in one day.

As the technology got better, researchers started trying to sequence many organisms at once, a complex task that requires figuring out how millions of short reads fit together to make any number of genomes. Eventually researchers wrote sophisticated software that can sort out the sequences using networks of powerful computers.
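To give a rough sense of the sorting problem, here is a minimal sketch that assigns reads from a mixed sample to known reference genomes by counting shared short substrings (k-mers). This is one simple, reference-based approach picked for illustration; it is an assumption, not a description of how Chan Zuckerberg ID or any other tool named in this article works. The reference names, sequences, and functions (`kmers`, `build_index`, `classify`) are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the names of the reference genomes containing it."""
    index = {}
    for name, genome in references.items():
        for km in kmers(genome, k):
            index.setdefault(km, set()).add(name)
    return index


def classify(read, index, k):
    """Assign a read to the reference sharing the most k-mers with it."""
    votes = Counter()
    for km in kmers(read, k):
        for name in index.get(km, ()):
            votes[name] += 1
    if not votes:
        return "unclassified"  # possibly something not in the references
    top = max(votes.values())
    winners = sorted(n for n, v in votes.items() if v == top)
    return winners[0] if len(winners) == 1 else "ambiguous"


# Made-up reference sequences and reads (assumptions, not real data).
references = {
    "virus_A": "ATGGCGTACGTTAGCCTAGGA",
    "virus_B": "CCGATTTGACCGGTAAACTGC",
}
index = build_index(references, k=5)
sample_reads = ["GTACGTTAGC", "TTTGACCGGT", "GGGGGGGGGG"]
labels = [classify(r, index, k=5) for r in sample_reads]
print(Counter(labels))
```

Reads that match nothing in the index are labeled "unclassified." In a surveillance setting, that leftover pile is where a previously unknown organism might turn up, though making sense of it takes far more work than this sketch suggests.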

“It’s not uncommon to spin up hundreds to thousands of CPUs to do this job,” said DeRisi, who created a free online software package called Chan Zuckerberg ID that solves these thorny sequencing puzzles on computers in California, then sends the results out to users in far-flung locations.

Metagenomic sequencing quickly became indispensable in some fields of research, particularly where researchers study the mix of microorganisms in an environment. “The number of sequenced viruses is exploding,” said Edward Holmes, a pathogen expert at the University of Sydney. “In the old days, you cultured viruses and sequenced them one at a time. No more. Now it’s just metagenomics.”

In medicine, metagenomics can help explain illnesses that aren’t picked up by more routine tests like those for flu or strep. That was how Ai Fen and her colleagues at Wuhan Central Hospital happened upon some of the first evidence of the novel coronavirus.

But researchers are also using metagenomics more intentionally to try to detect outbreaks early on, perhaps preventing another pandemic. One obvious potential risk comes from coronaviruses, which have already caused two major new diseases in humans this century. Using metagenomics as a tool to find out how the viruses move between animals in Asia could give researchers early warning about the development of new human pathogens. For instance, in 2021, researchers in Cambodia metagenomically sequenced samples from local bats and found two of the closest known relatives of SARS-CoV-2. In 2019, a group in China, using the same approach, discovered that pangolins carry similar viruses and could be a vector for passing them to humans. And Zheng-Li Shi, who found the likely birthplace of SARS and led the Wuhan Institute of Virology’s announcement of SARS-CoV-2, has used metagenomics to chart viruses in bats.

Researchers are also using metagenomics to watch for pathogens in other parts of the world. Cara Brook, an evolutionary biologist at the University of Chicago, runs such a project in Madagascar, which she said is the kind of place where new diseases are likely to emerge—a tropical country, with limited health care, that’s home to bats that carry human pathogens like the Ebola virus. What’s more, people in Madagascar eat some of the larger bats, providing a ready opportunity for viruses to break out.

In November 2022, Brook headed out into the forest in Madagascar with four graduate students to gather samples from three species of bat. The largest, Pteropus rufus, can have a wingspan of around three feet and makes a sizable meal. “The pectoral muscle mass is impressive,” said Brook. “It’s like a steak.”

With a wingspan around three feet, Pteropus rufus is the largest bat in Madagascar. People in the country eat the larger bats, providing opportunity for viruses to break out. ANECIA GENTLES/BROOK LAB

Brook recently caught, in good health, a bat that her team had sampled and tagged back in 2013. Bats can live up to 40 years, far longer than other mammals of similar size. It's thought that, as the only flying mammals, bats have developed unique features in their immune systems that explain both their longevity and why they carry so many viruses. That in turn may be why bats are the sources of so many human diseases.

In 2022, Brook and her colleagues reported the sequences of two new coronaviruses in the journal Frontiers in Public Health. According to the paper, the viruses don’t seem to be a threat to people, but knowing more about their family tree could help researchers better understand how coronaviruses evolve into pathogenic varieties. Brook is also sequencing samples from people in Madagascar who have unexplained fevers to see if she can pick up on new pathogens that have already crossed over from another animal.

Other groups are similarly looking for such crossovers. In 2018, for example, a group of researchers set up a sentinel program at three hospitals in China, using metagenomic sequencing to test people who had fevers and recent exposure to animals. Over the next three years, they found 35 people who were sick with a previously unknown virus, which they described in the New England Journal of Medicine in August 2022. The researchers also tested animals near the patients’ homes and found the virus in goats, dogs, mice, one unlucky vole, and most often in shrews, which the researchers suspect are its natural reservoirs.

None of the patients died, and the researchers said the virus doesn’t likely spread between people. But if the virus evolves to become more dangerous, doctors may now be better prepared for it.


Many researchers told Undark that metagenomics should play a larger role in watching for outbreaks. Alexander Greninger, a microbiologist at the University of Washington, said the most obvious way to use the technique is by testing people who die without explanation. “This is the ultimate in, ‘Well, it doesn’t change patient management,’ but it’s important for the diagnostic enterprise to know what it’s missing,” said Greninger. “Aren’t we chiefly concerned about mortality for these new viral pandemics?” he asked. “It really is the canary in the coal mine.”

But there are barriers to everyday use. A key organizational problem in American health care is that insurers generally pay for traditional tests that cover one disease, Greninger said, rather than tests that look for many, like metagenomic sequencing.

Allowing for such tests could make a big difference. In early February 2020, three weeks before doctors knew COVID-19 was spreading in the U.S., a woman in the Bay Area with flu-like symptoms suddenly died. Her death baffled the local coroner and the woman’s family, who wondered if she had the novel coronavirus. But since she had not traveled to China recently, she was not tested until two months later, when she was identified as the first American to succumb to the disease. In the absence of widespread testing, DeRisi, based nearby in San Francisco, said his lab could have quickly recognized that she had COVID-19 if the health care system connected patients to metagenomics.

The CZ Biohub has trained hundreds of researchers, including Brook, to use DeRisi's metagenomics tool to identify pathogens around the world. “Bottom line, metagenomics is a great way to build an early-warning radar,” he said. In the future, he said, doctors and scientists may routinely use the technology to watch for both new and old diseases. The next generation of metagenomic sequencing, along with advances in related technologies, he added, “will be used to replace many of the main diagnostic systems we have for infectious diseases.”


Metagenomics has trade-offs. One issue is the price: Sequencing costs more than rapid tests for common infections, especially considering that most metagenomic tests of people find a relatively small number of common pathogens, which isn’t very valuable, said Greninger. In China, the cost of sequencing is lower, mostly because companies there sequence so many samples, giving them significant economies of scale.

Researchers have also pointed out that many metagenomic successes are case reports detecting new or unexpected pathogens in small numbers of people. It’s been harder to show how the tool should be applied systematically across a health system—under which circumstances to test a patient, for instance, and then how to act on complicated results. Greninger said the field has “fanboys” who promote metagenomics as flashy, big-data tech while downplaying the complexities. “Academics are in the business of selling the future,” he said.

Beyond those financial and medical hurdles, politics might present a higher one: Even if the tool were widespread, governments have to share the information in order for it to be useful. The debate over the origins of COVID “has gotten so toxic, people are less likely to collaborate now,” said Holmes. “If you had a novel infection in Russia, do you think we’ll ever hear about it? If there’s a novel infection in China, do I think the Chinese government will allow studies? No way. We’re in a worse situation now than we were before the pandemic.” DeRisi said the uncoordinated nature of the U.S. health system also stymies the kind of coordinated response needed to quickly stop outbreaks.

Even when it was used to quickly identify SARS-CoV-2, metagenomics smacked into political reality. At Wuhan Central Hospital in 2019, after Ai Fen heard that her patient was carrying a new coronavirus, she drew a red circle around the relevant text in a written report, took a photo of the document, and sent it to a group chat with other doctors in Wuhan. She soon got a severe rebuke from the hospital disciplinary committee for “spreading rumors” and “harming stability.” Chinese officials were gagging doctors in Wuhan, using their power to stop the spread of inconvenient truths.

Ai had no way of knowing how costly that delay would be. “Had I known what would happen today,” she told the Chinese magazine People, “I would not have cared about all the reprimands and criticisms and would have spoken up everywhere.”


Amos Zeeberg is a freelance journalist whose work has appeared in publications including The New Yorker, The Atlantic, and Discover.

This article was originally published on Undark.