Mistaken Identities

Researchers are working to automate the arduous task of identifying—and amending—mislabeled sequences in genetic databases.

Kerry Grens

FLICKR, KEVIN MACKENZIE, UNIVERSITY OF ABERDEEN

Researchers at King’s College London were working on some human gene expression experiments in 2008 when they got a strong match to one of the probe sequences in an Affymetrix microarray. The only available information on the gene from the chip was that it was a human sequence, recalled William Langdon, who helped on the project. So the team did a BLAST search to look up more information. “And the first thing you get back is, of course, the human sequence itself,” said Langdon, who is now at University College London. But when he scanned down the list of the other related sequences that appeared in the search, it was apparent something was amiss. “They [were] all different species of Mycoplasma.”

It appeared to be a case of mistaken identity; the original submitter of the sequence to GenBank must have had Mycoplasma contamination in a human sample and assumed the sequence was human. In a study Langdon and colleagues published in 2009, the authors showed the striking resemblance between this “human” sequence and a particular marker sequence from various Mycoplasma species.

To this day, the sequence is still labeled “Homo sapiens unknown” in GenBank, the National Center for Biotechnology Information (NCBI) database. This misnomer represents one of the hundreds, perhaps thousands, of sequences deposited to GenBank and elsewhere that have been assigned to the wrong taxon.

That errors exist in GenBank and other databases is a truism. But correcting mislabeled sequences is a difficult task, one that database stewards and ...

Meet the Author

Kerry Grens

Kerry served as The Scientist’s news director until 2021. Before joining The Scientist in 2013, she was a stringer for Reuters Health, the senior health and science reporter at WHYY in Philadelphia, and the health and science reporter at New Hampshire Public Radio. Kerry got her start in journalism as a AAAS Mass Media fellow at KUNC in Colorado. She has a master’s in biological sciences from Stanford University and a biology degree from Loyola University Chicago.
