Human Gene Set Shrinks Again

Proteomic data suggest the human genome may encode fewer than 20,000 genes.

Written by Jyoti Madhusoodanan

An analysis of proteomic data from seven studies suggests the human genome contains fewer than 20,000 protein-coding genes, 1,700 fewer than previously predicted. The results, published last month (June 16) in Human Molecular Genetics and posted last year on arXiv.org, also show little evidence of protein expression for evolutionarily recent genes that can be traced back only to primate lineages.

Estimates of the human genome's protein-coding gene count have shrunk steadily. The first sequences, published in 2001, predicted 26,000 to 30,000 genes; a recent evolutionary comparison suggested the number was closer to 20,500. Now, that number might be reduced to approximately 19,000.

As the Physics arXiv Blog wrote in January, “That’s an interesting result that is partly a reflection of the state of genomics. The human genome is by no means fully defined and biologists are still in the process of refining their gene models and withdrawing genes in the process.”

"The coding part of the genome [which produces proteins] is constantly moving," lead author Alfonso Valencia of the Spanish National Cancer Research Centre said in a statement. "No one could have imagined a ...
