Much of the history of genetics and molecular biology was built using microorganisms such as Escherichia coli and Staphylococcus aureus. Yet these species are the exception, not the rule, in microbiology: Unlike the vast majority of microbes, they can be cultivated in the laboratory, grown in pure culture, and thus dissected biochemically.
Studying the unculturable posed a problem. In the 1980s microbiologists began using 16S ribosomal RNAs to catalog and enumerate such species in the environment. But this approach says next to nothing about the species themselves. This issue's two Hot Papers represent the next step forward. Jill Banfield, professor of earth and planetary science at the University of California, Berkeley, and J. Craig Venter, head of the eponymous Institute, independently published two studies in early 2004 that used metagenomics to survey and reconstruct genomes from two very different microbial ecosystems.
In metagenomics (or environmental genomics) total genetic material is collected and sequenced from an environmental sample, rather than from individual isolates. "What you're really doing is trying to infer the properties of microbes within complex microbial communities en masse," says Ed DeLong, a professor of civil and environmental engineering at the Massachusetts Institute of Technology, who uses metagenomics in his own research.
Fishing and Mining
Though neither study invented metagenomics, both pushed the technological envelope, attempting to assemble entire genomes from uncultivated organisms - a feat akin to piecing together thousands of different jigsaw puzzles from a single pool of pieces, without the pictures on the box. "I think it's a whole new way of looking at diversity on this planet," says Venter. "The studies that have been going on for the past 30-some-odd years in molecular biology have only given us a very narrow slice of biological systems."
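The jigsaw analogy can be made concrete with a toy example. The sketch below is not the pipeline either team used; it is a minimal greedy overlap merger, written here only to illustrate how short reads pooled from more than one genome can be stitched back into separate contigs. The function names, the `min_len` threshold, and the two made-up "genomes" are all assumptions for illustration, and real assemblers must also cope with sequencing errors, repeats, and uneven coverage.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best match is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and returns
    whatever contigs remain (ideally one per source genome).
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Hypothetical reads drawn from two invented sequences, mixed in one pool.
reads = ["ATGGCG", "GCGTAC", "GTACG", "CCTTAAGG", "AAGGTTCC"]
contigs = greedy_assemble(reads)
```

With these invented reads the pool resolves into two contigs, one per source sequence. The greedy choice is what makes the toy short; it is also what makes it fragile, since a single spurious overlap between unrelated organisms would fuse their contigs.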
Data derived from the Science Watch/Hot Papers database and the Web of Science (Thomson Scientific, Philadelphia) show that Hot Papers are cited 50 to 100 times more often than the average paper of the same type and age.
G.W. Tyson et al., "Community structure and metabolism through reconstruction of microbial genomes from the environment," Nature, 428:37-43, 2004. (Cited in 128 papers, Hist Cite Analysis)
J.C. Venter et al., "Environmental genome shotgun sequencing of the Sargasso Sea," Science, 304:66-74, 2004. (Cited in 311 papers, Hist Cite Analysis)
Banfield's team used some 76 million bases of sequence to reconstruct two near-complete bacterial and archaeal genomes, and partially reconstruct three others, from an acidic biofilm obtained from a mine in Iron Mountain, Calif.1 Venter obtained more than 1 billion bases and 1.2 million protein-coding genes, representing between 1,800 and 40,000 species, from the Sargasso Sea.2 Both teams used those data to make inferences about the ecology of those environments.
The Banfield study "was really the first demonstration that you could apply shotgun sequencing techniques to reassemble a composite genome from a natural population," says DeLong. That population, a pink biofilm on the surface of an acid mine drainage system, was relatively simple, containing only five different species. "Venter's study was much more ambitious," says DeLong, "because they went to oceanic microbial populations."
Venter assembled a few nearly complete bacterial genomes - one Burkholderia and two Shewanella species. Though some researchers, notably DeLong,3 questioned whether these are actually residents of the Sargasso Sea and not contaminants, the study broke ground. Venter's team showed, DeLong says, that "with one very deep pass, a whole lot of new genomic components, biochemical capabilities, and so on, could be uncovered."
'It Takes a Village'
Metagenomics may have been born of necessity, but it actually makes sense in terms of studying bacteria in the wild, says DeLong. "Environmental processes are generally never catalyzed by a single microbe," he explains. Instead, microbial consortia work together to drive biogeochemical processes. "What I like to say is, 'it takes a village'."
In Banfield's acid mine drainage system, for instance, five microbial species cooperate to metabolize dissolved iron. Sequence data suggested that only one of the five (Leptospirillum group III) appeared capable of fixing gaseous nitrogen into a bioavailable form for the community. Banfield's team succeeded in cultivating that organism - the first member of its genus to be cultured - by using growth media free of any nitrogen sources.4 "That was incredibly cool," says Jo Handelsman, a metagenomics researcher at the University of Wisconsin, Madison, "because it was physiological information from the genome sequences that led her to that strategy."
Since his 2004 study Venter has upped the metagenomics ante. Circumnavigating the globe on the Sorcerer II (www.sorcerer2expedition.org), Venter collected water samples for genomic analysis every 200 miles. His team has collected "multiple billions of base pairs, and over six million new genes," Venter says. He expects to publish initial findings "probably sometime this summer."
Banfield's team has moved beyond metagenomics into "community proteomics," surveying the protein complement of their biofilm samples and identifying 2,033 proteins, including a low-abundance cytochrome that appears to be responsible for the ability of the community's dominant organism to oxidize iron.5
Meanwhile, other researchers continue to sift through the data Banfield and Venter generated more than two years ago. In one such study, Ed Rubin, director of the Department of Energy's Joint Genome Institute at Lawrence Berkeley National Laboratory, and colleagues used the two datasets, among others, to identify genomic signatures for the different types of organisms and metabolic capabilities that define different ecosystems - an approach DeLong calls "comparative community genomics."6
Microbial communities, Rubin explains, are too complex to be understood at the organismal level, so he takes a more gene-centric approach. One particular soil community, for instance, was found to be rich in potassium transporters. "We went back and looked at the soil and found out it was rich in potassium," Rubin says. Likewise, shallow-water communities tended to be rich in photosynthetic genes, whereas deep-water communities were not. Extrapolating from those observations, Rubin says it might similarly be possible to identify fingerprints of environments that contain oil or that are particularly fertile. "You can use these fingerprints as a diagnostic, a gene-based diagnostic to tell you about environments."
"For a lot of us that's pretty exciting," says DeLong, "because we're getting access to ecological information that was inaccessible until some of these techniques came along."