“It’s surprising that there was such a large difference,” says James Cole, a molecular microbial ecologist at Michigan State University, who was not involved in the work. “It sends out a caution to people using these types of tools to make sure that their tool is actually performing well on the types of samples that they’re working on.”
The amplicon approach matches samples to known bacterial taxa based on a single sequence: that of the highly conserved gene for bacterial 16S rRNA. Shotgun sequencing instead takes a genome-wide approach, sequencing random fragments of bacterial DNA and matching the resulting profile to a database using common sequences or clade-specific marker genes.
While analyzing data for a metagenomics project at several floodplain sites in Brazil, team members at the American Museum of Natural History (AMNH) realized they had enough samples to perform their own large-scale evaluation of the methods on a poorly studied system. “We thought, hey, we can compare amplicon and shotgun—let’s go for it,” says study coauthor Michael Tessler, a recent graduate from the AMNH’s Richard Gilder Graduate School. “We were thinking that shotgun was going to be equal or better.”
To their surprise, however, the researchers found that shotgun sequencing recovered less than 50 percent of the phyla identified in the Brazilian water samples by amplicon sequencing. What’s more, the amplicon approach detected 27 percent more families, despite using just a fraction of the volume of DNA.
“It’s pretty staggering,” says Tessler, adding that even the distribution of taxa discovered by shotgun sequencing differed. “Not only are there fewer phyla and families, but there are just a few families that really dominate the sample.”
The authors put the poor performance of shotgun sequencing mainly down to the weakness of the database used in the study, as compared to databases for the 16S rRNA gene. “We’ve been sequencing that gene for the longest time,” explains study coauthor Mercer Brugler, a research associate at the AMNH and an assistant professor in biology at the NYC College of Technology. “We’ve got a huge database for 16S.” In databases generally used for shotgun sequencing, “the availability of genomes to query is quite limited for these unique, remote environments.”
Metagenomics and bioinformatics researcher Juan Jovel of the University of Alberta seconds this interpretation. “For specific environments, the completeness of databases will determine how effectively bacterial sequences can be classified to taxa using different approaches,” he says. Although the authors only used one type of analysis for their shotgun data—MetaPhlAn, a popular computational tool also used in the Human Microbiome Project—the results underline the fact that “there is a huge diversity of microorganisms in remote environments,” says Jovel, who was not involved in the present study, “and we are just starting to explore that diversity.”
Brugler agrees that the current results are unlikely to be the last word. “Our findings are going to change,” he says, adding that the team is planning to examine other databases, too. “Shotgun, I think, is going to become the go-to technology as these databases improve . . . but if it’s a non-human system, I don’t know if we’re ready for it just yet.”
In the meantime, the fact that the two methods showed such different results suggests that researchers might be wise to avoid “overconfidence” in any one method, particularly for less-studied systems, Tessler notes. “In science, often we want an answer quickly,” he says. “But taking the time to do a pilot study [with each sequencing technique], even with a few samples, could provide a lot of information about what each of these two methods might do for you. It’s worth showing you did your due diligence.”
M. Tessler et al., “Large-scale differences in microbial diversity discovery between 16S amplicon and shotgun sequencing,” Scientific Reports, 7:6589, 2017.