Ocean home to new proteins, families

Sequencing the sea's microbial communities uncovers millions of new proteins and thousands of unknown protein families

Written by Melissa Lee Phillips

A sampling of genetic sequences from ocean microbial communities reveals millions of new proteins and thousands of new protein families, according to a report in this month's Public Library of Science Biology. The analysis also suggests that continued sampling of microbial communities will reveal novel protein families "for some time to come," said study first author Shibu Yooseph of the J. Craig Venter Institute in Rockville, Md.

From 2003 to 2005, the Sorcerer II Global Ocean Sampling expedition collected seawater samples from a range of the world's oceans. Yooseph and his colleagues analyzed the genetic fragments in these samples using the metagenomics technique of shotgun sequencing. They assembled 7.7 million microbial sequences and predicted that these sequences code for 6.12 million proteins -- nearly twice the number of proteins in current databases, according to the authors. "The actual number of proteins might not be surprising to some people," Yooseph told The Scientist, but "we were really surprised with the amount of diversity."

The authors found that many of the newly discovered proteins clustered into previously unknown protein families. They have evidence for at least 2,000 fairly large clusters of novel protein families, Yooseph said.

These findings suggest that additional analyses of samples from other environments -- such as soil or the deep sea -- "will continue to reveal a great deal of novelty in terms of proteins," Yooseph said.

"It's been known for some years that we still are in a linear phase in terms of protein family discovery," said Darren Natale of Georgetown University Medical Center in Washington, D.C., who was not involved in the study. But "it's nice that the dataset here is so large and that it still holds," he told The Scientist.

"Having this enormous additional number of sequences from very unfamiliar organisms will be immensely useful," said Cyrus Chothia of the MRC Laboratory of Molecular Biology in Cambridge, UK. "It will give us much greater information about how diverse families can be."

However, Chothia added that relying solely on sequence information to determine protein families may be misleading. Without structural or functional information about these proteins, it remains possible that some of the new proteins are simply relatives of known proteins whose sequences diverged greatly, Chothia said. "It may well be true that they found entirely new families, but it could be true that they are very distant relatives of known families," he told The Scientist.

Yooseph and his colleagues also found that several protein domains thought to be kingdom-specific are actually found in more than one kingdom of life. These findings suggest either that some lineages are more ancient than previously thought, or that these shared sequences have jumped kingdoms through lateral gene transfer, Yooseph said. "We will need to look at those on a case-by-case basis."

The expedition's samples also turned up more sequences of viral origin than suspected, Yooseph said, indicating that researchers are far from fully exploring the diversity of viruses. For example, the researchers found that at least two protein families -- UV repair enzymes and glutamine synthetase -- contain many new viral additions.

In accompanying papers in the same issue of PLoS Biology, researchers present analyses of two other aspects of Sorcerer II data. In the first paper, Douglas B. Rusch of the Venter Institute and colleagues analyze genome structure and evolution in the microbial samples and present new methods for measuring the genomic similarity between metagenomic samples. They also show various ways in which oceanic organisms differ based on their location or environmental pressures. In the other paper, Natarajan Kannan of the University of California, San Diego and colleagues examine the protein kinase-like (PKL) superfamily, and report that these proteins cluster into 20 major families that contain many family-specific features.

Melissa Lee Phillips
mail@the-scientist.com

Links within this article

S. Yooseph et al., "The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families," PLoS Biology, March 2007.
http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0050016

J.M. Perkel, "The big picture in microbial genomics," The Scientist, July 1, 2006.
http://www.the-scientist.com/article/display/23800

T.M. Powledge, "Shotgun sequencing comes of age," The Scientist, December 31, 2002.
http://www.the-scientist.com/article/display/20975/

J.M. Perkel, "Bacterial census of Texas air reveals microbial diversity," The Scientist, December 19, 2006.
http://www.the-scientist.com/news/display/38145/

Darren Natale
http://pir.georgetown.edu/pirwww/aboutpir/natalebio.shtml

Cyrus Chothia
http://www.mrc-lmb.cam.ac.uk/genomes/TCB/SGG.html

D.B. Rusch et al., "The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific," PLoS Biology, March 2007.
http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0050077

N. Kannan et al., "Structural and Functional Diversity of the Microbial Kinome," PLoS Biology, March 2007.
http://biology.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pbio.0050017