Contaminated genomes

Human DNA sequences are found in nearly a quarter of the publicly available non-primate genomes, emphasizing the need for better quality control measures.

Written by Hannah Waters

More than 20 percent of non-primate genome sequences from the top public sequencing facilities are contaminated with human DNA, reports a linkurl:study;http://www.plosone.org/article/info:doi/10.1371/journal.pone.0016410 published today (February 16) in PLoS ONE.
A Sanger sequencing read
Image: Wikimedia commons, Loris
This research calls for scientists to work harder to ensure that the genomes they're sequencing do not become contaminated during the sequencing process, and, more importantly, to check for potential contamination in genomes pulled from the public databases on which genomes are normally deposited. "Genome contamination is a big problem -- but it's not new," said linkurl:Jonathan Eisen,;http://bobcat.genomecenter.ucdavis.edu/mediawiki/index.php/Main_Page evolutionary biologist at the University of California, Davis and lead of the phylogenomics program at the United States Department of Energy Joint Genome Institute. "This paper might help remind people of this [issue]."

Contamination can be introduced into a genomic sequence at any number of stages. It could be airborne bacteria landing in a sample, or even DNA fragments floating around in reagents, left behind after sterilization. But probably the most common contaminant is the scientist herself. It just takes a skin cell falling into the sample before amplification. "Are you wearing gloves to protect yourself from your sample or your sample from you?" linkurl:Rachel O'Neill,;http://www.oneill.mcb.uconn.edu/R.ONeill_Lab/Home.html paper author and molecular geneticist at the University of Connecticut, wondered. "I think it's a little bit of both."

A graduate student in O'Neill's lab was screening genome databases for conserved sequences, and was excited to find the same sequence across diverse species. However, when he tried to replicate the results in the lab, he failed, suggesting that the database genomes were contaminated. So he decided to screen all non-primate genomes housed in four public databases -- University of California, Santa Cruz's genome browser, National Center for Biotechnology Information's GenBank, the Joint Genome Institute, and Ensembl -- for human-specific repetitive sequences known as AluY elements.

Of the 2,057 raw sequence genomes searched, 454 contained this human DNA sequence, or 22.39 percent. "The level of contamination we have found is high enough to show concern," said O'Neill. And that's just contamination from human sources, she added -- just imagine how much contamination could exist from species like E. coli or others commonly found in the lab.

Eisen noted the flurry of papers reporting horizontal gene transfers between species, such as the linkurl:report;http://mbio.asm.org/content/2/1/e00005-11.long this week in mBio of human DNA acquired by gonorrhea, and wondered if this could simply be an issue of human DNA contaminating the data.

The frequency of human contamination requires scientists to do extra experiments, to go above and beyond the norm to confirm their results, Eisen argued. "All you need is one cell to do something weird and you have the potential for all kinds of anomalies."

"There is always that lingering doubt," linkurl:Mark Pallen,;http://pathogenomics.bham.ac.uk/staff/mpallen.html a microbial genomicist at the University of Birmingham, said of the gonorrhea sequence, though he added he thinks the gonorrhea example is probably a case of bona fide DNA transfer.

The high level of sequence contamination could spell real trouble when it comes to human sequencing, O'Neill said. "Finding an Alu element from a human in a fish sample is very straightforward," she said. "Finding a human sample in a human sample is where the difficulty comes in." Relying on sequencing with such high human contamination to make decisions about personal health could be catastrophic.

Moving forward, scientists must invest more in quality control, Eisen said, but the importance of this step can be lost behind the pressure to generate more data. "It would be nice if everybody took a step back and said that the quality of data is also important," he said. "But it's a hard argument to win; it's hard to convince myself in some cases."

Longo, M.S., et al. "Abundant Human DNA Contamination Identified in Non-Primate Genome Databases." PLoS ONE, DOI: linkurl:10.1371/journal.pone.0016410;http://www.plosone.org/article/info:doi/10.1371/journal.pone.0016410

**__Related stories:__**
* linkurl:Sequencing on target;http://www.the-scientist.com/article/display/55645/ [1st May 2009]
* linkurl:Bacterial genes jump to host;http://www.the-scientist.com/news/display/53552/ [30th August 2007]
* linkurl:DNA Sequencing Industry Sets its Sights on the Future;http://www.the-scientist.com/2004/09/27/44/1/ [27th September 2004]
* linkurl:Related F1000 evaluations;http://f1000.com/search/evaluations?query=genome+contamination [16th February 2011]