New Database Expands Number of Estimated Human Protein-Coding Genes

Some scientists are not yet convinced that the list is accurate.

Jun 19, 2018
Diana Kwon

The human genome may contain more protein-coding genes than prior analyses suggested. A study posted last month (May 29) on bioRxiv presents an expanded gene database that includes approximately 5,000 novel genes. Around 1,000 of those code for proteins, raising the estimated number of protein-coding genes from around 20,000 to 21,000.

“If people like our gene list, then maybe a couple years from now we’ll be the arbiter of human genes,” study coauthor Steven Salzberg, a computational biologist at Johns Hopkins University, tells Nature.

Salzberg and his colleagues compiled a catalog of human genes and transcripts using data from the Genotype-Tissue Expression (GTEx) project, in which scientists sequenced the RNA from various tissues in hundreds of human subjects. By comparing the sequenced RNA to the human genome, the researchers assembled a database of 43,162 genes: 21,306 that code for proteins and 21,856 that are noncoding.

According to Nature, this dataset includes many more genes than existing datasets. For example, the GENCODE gene set, a widely used human gene database run by the European Bioinformatics Institute (EBI) in the U.K., includes 19,901 protein-coding genes and 15,779 noncoding ones.

Some scientists say more evidence is required to verify that the new gene list is accurate. For example, Adam Frankish, a computational biologist at the EBI who works on the GENCODE project and was not involved in the study, tells Nature that after carefully analyzing about 100 of the newly identified protein-coding genes, he and his colleagues found that only one of them seems to truly code for protein.

Salzberg tells Nature that having an accurate gene count is important because uncounted genes are frequently ignored, meaning that those containing disease-causing mutations may be overlooked. On the other hand, Frankish tells Nature that hastily adding genes could also be problematic, because the additions may divert scientists’ attention away from the genes that are actually involved in a disease.
