Mistaken Identities

Researchers are working to automate the arduous task of identifying—and amending—mislabeled sequences in genetic databases.

Kerry Grens


FLICKR, KEVIN MACKENZIE, UNIVERSITY OF ABERDEEN

Researchers at King’s College London were working on some human gene expression experiments in 2008 when they got a strong match to one of the probe sequences in an Affymetrix microarray. The only available information on the gene from the chip was that it was a human sequence, recalled William Langdon, who helped on the project. So the team did a BLAST search to look up more information. “And the first thing you get back is, of course, the human sequence itself,” said Langdon, who is now at University College London. But when he scanned down the list of the other related sequences that appeared in the search, it was apparent something was amiss. “They [were] all different species of Mycoplasma.”

It appeared to be a case of mistaken identity; the original submitter of the sequence to GenBank must have had Mycoplasma contamination in a human sample, and assumed the sequence was human. In a study Langdon and colleagues published in 2009, the authors showed the striking resemblance between this “human” sequence and a particular marker sequence from various Mycoplasma species.
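The check Langdon describes doing by eye (set aside the query's own record, then see which taxon the remaining hits belong to) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the 2009 study: the function name `flag_suspect_label`, the `min_fraction` threshold, and the hit list are all hypothetical, and the genus names are assumed to have already been extracted from search results.

```python
from collections import Counter


def flag_suspect_label(labeled_genus, hit_genera, min_fraction=0.8):
    """Return the dominant genus among the hits if it contradicts the label.

    labeled_genus: genus the database record claims (e.g. "Homo").
    hit_genera:    genus of each related hit, excluding the query's own record.
    min_fraction:  hypothetical cutoff; share of hits that must agree with
                   each other before the label is called into question.
    Returns None when there are no hits or no strong disagreement.
    """
    if not hit_genera:
        return None
    genus, count = Counter(hit_genera).most_common(1)[0]
    if genus != labeled_genus and count / len(hit_genera) >= min_fraction:
        return genus
    return None


# Hypothetical hit list: the record itself first, then the related sequences.
hits = ["Homo", "Mycoplasma", "Mycoplasma", "Mycoplasma", "Mycoplasma"]
suspect = flag_suspect_label("Homo", hits[1:])  # skip the self-hit
print(suspect)
```

A result other than `None` only marks a record for human review; it does not establish which label is correct.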

To this day, the sequence is still labeled “Homo sapiens unknown” in the National Center for Biotechnology Information (NCBI) database GenBank. This misnomer represents one of the hundreds, perhaps thousands, of sequences deposited to GenBank and elsewhere that have been assigned to the wrong taxon.

That errors exist in GenBank and other databases is a truism. But correcting mislabeled sequences is a difficult task, one that database stewards and ...


Meet the Author

    Kerry Grens

    Kerry served as The Scientist’s news director until 2021. Before joining The Scientist in 2013, she was a stringer for Reuters Health, the senior health and science reporter at WHYY in Philadelphia, and the health and science reporter at New Hampshire Public Radio. Kerry got her start in journalism as a AAAS Mass Media fellow at KUNC in Colorado. She has a master’s in biological sciences from Stanford University and a biology degree from Loyola University Chicago.
