Technical Bias Widespread in RNA-Seq Datasets

Genes that are exceptionally long or short are overrepresented in some published reports, which can lead to misinterpreted results.

Written by Diana Kwon


RNA sequencing is a popular tool among molecular biologists because it allows them to examine gene expression patterns. However, the technique is susceptible to experimental artifacts, which can lead to misinterpreted findings. According to a study published last week (November 12) in PLOS Biology, one such bias, which is associated with gene length, is widespread in published datasets.

Rani Elkon, a bioinformatician at Tel Aviv University in Israel, says that his team was analyzing RNA sequencing (RNA-seq) datasets for a project aimed at inferring the co-regulation of genes by examining their co-expression across many different biological conditions when they stumbled upon a puzzling finding: Genes coding for proteins in the ribosome or other translation-related machinery, which are exceptionally short, and genes coding for extracellular matrix proteins such as collagen, which are exceptionally long, kept popping up in their analyses. “In many different datasets, genes that were upregulated ...


Meet the Author

  • Diana is a freelance science journalist who covers the life sciences, health, and academic life. She’s a regular contributor to The Scientist, and her work has appeared in several other publications, including Scientific American, Knowable, and Quanta. Diana is a former intern at The Scientist and holds a master’s degree in neuroscience from McGill University. She’s currently based in Berlin, Germany.
