Contaminated genomes

Human DNA sequences are found in nearly a quarter of the publicly available non-primate genomes, emphasizing the need for better quality control measures.

Written by Hannah Waters

More than 20 percent of non-primate genome sequences from the top public sequencing facilities are contaminated with human DNA, reports a study (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0016410) published today (February 16) in PLoS ONE.
[Image: A Sanger sequencing read. Credit: Wikimedia Commons, Loris]
This research calls for scientists to work harder to ensure that the genomes they're sequencing do not become contaminated during the sequencing process, and, more importantly, to check for potential contamination in genomes pulled from the public databases in which genomes are normally deposited. "Genome contamination is a big problem -- but it's not new," said Jonathan Eisen (http://bobcat.genomecenter.ucdavis.edu/mediawiki/index.php/Main_Page), an evolutionary biologist at the University of California, Davis, and lead of the phylogenomics program at the United States Department of Energy Joint Genome Institute. "This paper might help remind people of this [issue]."

Contamination can be introduced into a genomic sequence at any number of stages. It could be airborne bacteria landing in a sample, or even DNA fragments floating around in reagents, left behind after sterilization. But probably the most common contaminant is the scientist herself. It just takes a skin cell falling into the sample before amplification. "Are you wearing gloves to protect yourself from your sample or your sample from you?" Rachel O'Neill (http://www.oneill.mcb.uconn.edu/R.ONeill_Lab/Home.html), paper author and molecular geneticist at the University of Connecticut, wondered. "I think it's a little bit of both."

A graduate student in O'Neill's lab was screening genome databases for conserved sequences, and was excited to find the same sequence across diverse species. However, when he tried to replicate the results in the lab, he failed, suggesting that the database genomes were contaminated.
So he decided to screen all non-primate genomes housed in four public databases -- the University of California, Santa Cruz's genome browser, the National Center for Biotechnology Information's GenBank, the Joint Genome Institute, and Ensembl -- for human-specific repetitive sequences known as AluY elements.

Of the 2,057 raw sequence genomes searched, 454 contained this human DNA sequence, or 22.39 percent. "The level of contamination we have found is high enough to show concern," said O'Neill. And that's just contamination from human sources, she added -- just imagine how much contamination could exist from species like E. coli or others commonly found in the lab.

Eisen noted the flurry of papers reporting horizontal gene transfers between species, such as the report (http://mbio.asm.org/content/2/1/e00005-11.long) this week in mBio of human DNA acquired by the bacterium that causes gonorrhea, and wondered if this could simply be an issue of human DNA contaminating the data.

The frequency of human contamination requires scientists to do extra experiments, to go above and beyond the norm to confirm their results, Eisen argued. "All you need is one cell to do something weird and you have the potential for all kinds of anomalies."

"There is always that lingering doubt," Mark Pallen (http://pathogenomics.bham.ac.uk/staff/mpallen.html), a microbial genomicist at the University of Birmingham, said of the gonorrhea sequence, though he added that he thinks the gonorrhea example is probably a case of bona fide DNA transfer.

The high level of sequence contamination could spell real trouble when it comes to human sequencing, O'Neill said. "Finding an Alu element from a human in a fish sample is very straightforward," she said. "Finding a human sample in a human sample is where the difficulty comes in."
Relying on sequencing with such high human contamination to make decisions about personal health could be catastrophic.

Moving forward, scientists must invest more in quality control, Eisen said, but the importance of this step can be lost behind the pressure to generate more data. "It would be nice if everybody took a step back and said that the quality of data is also important," he said. "But it's a hard argument to win; it's hard to convince myself in some cases."

Longo, M.S., et al. "Abundant Human DNA Contamination Identified in Non-Primate Genome Databases." PLoS ONE, DOI: 10.1371/journal.pone.0016410 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0016410)
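
For readers who want a feel for what a contamination screen involves, here is a minimal Python sketch of the general idea reported above: look for a marker sequence in each record and count how many records contain it. It is not the method used in the paper. The marker string, the record names, and the toy sequences are all invented for illustration, and exact substring matching on both strands is a simplification; a real screen would more likely rely on an alignment tool that tolerates mismatches.

```python
# Toy contamination screen: flag any record that contains a marker sequence
# on either strand. MARKER is an arbitrary placeholder string (an assumption
# for this sketch), not taken from any real AluY element.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

MARKER = "TTAGGCATCG"  # hypothetical stand-in for a human-specific repeat


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def contains_marker(seq: str, marker: str = MARKER) -> bool:
    """True if the marker occurs exactly on the forward or reverse strand."""
    seq = seq.upper()
    return marker in seq or reverse_complement(marker) in seq


def screen(records: dict[str, str], marker: str = MARKER) -> list[str]:
    """Return the names of records in which the marker was found."""
    return [name for name, seq in records.items() if contains_marker(seq, marker)]


# Invented toy records (names and sequences are illustrative only).
records = {
    "fish_contig_1": "ATATATTTAGGCATCGTTTT",  # marker on the forward strand
    "fish_contig_2": "ATATATCCCCCCCCCCTTTT",  # no marker
    "frog_contig_1": "AAAACGATGCCTAAGGGG",    # marker on the reverse strand
    "frog_contig_2": "ACGTACGTACGTACGT",      # no marker
}

flagged = screen(records)
fraction = len(flagged) / len(records)
print(flagged)
print(f"{fraction:.2%} of records flagged")
```

Checking the reverse strand matters because a contaminating fragment can be recorded in either orientation. A hit in a sketch like this only marks a record as worth a closer look; it does not by itself establish contamination.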
**__Related stories:__**
* Sequencing on target (http://www.the-scientist.com/article/display/55645/) [1st May 2009]
* Bacterial genes jump to host (http://www.the-scientist.com/news/display/53552/) [30th August 2007]
* DNA Sequencing Industry Sets its Sights on the Future (http://www.the-scientist.com/2004/09/27/44/1/) [27th September 2004]
* Related F1000 evaluations (http://f1000.com/search/evaluations?query=genome+contamination) [16th February 2011]