
New Database Expands Number of Estimated Human Protein-Coding Genes

Some scientists are not yet convinced that the list is accurate.

Jun 19, 2018
Diana Kwon


The human genome may contain more protein-coding genes than prior analyses suggested. A study published last month (May 29) on bioRxiv provides an expanded database that includes approximately 5,000 novel genes. Around 1,000 of those code for proteins, raising the estimated number of protein-coding genes from around 20,000 to 21,000.

“If people like our gene list, then maybe a couple years from now we’ll be the arbiter of human genes,” study coauthor Steven Salzberg, a computational biologist at Johns Hopkins University, tells Nature.

Salzberg and his colleagues compiled a catalog of human genes and transcripts using data from the Genotype-Tissue Expression (GTEx) project, in which scientists sequenced the RNA from various tissues in hundreds of human subjects. By comparing the sequenced RNA to the human genome, the researchers were able to compile a database of 43,162 genes, of which 21,306 code for proteins and 21,856 are noncoding.

According to Nature, this dataset includes many more genes than existing datasets do. For example, the GENCODE gene set, a widely used human gene database run by the European Bioinformatics Institute (EBI) in the U.K., includes 19,901 protein-coding genes and 15,779 noncoding ones.

Some scientists say more evidence is required to verify that the new gene list is accurate. For example, Adam Frankish, a computational biologist at the EBI who works on the GENCODE project and was not involved in the study, tells Nature that after carefully analyzing about 100 of the newly identified protein-coding genes, he and his colleagues found that only one of them seems to truly code for a protein.

Salzberg tells Nature that having an accurate gene count is important, because uncounted genes are frequently ignored—meaning those containing disease-causing mutations may be overlooked. On the other hand, Frankish tells Nature that hastily adding genes could also be problematic, because they may divert scientists’ attention away from the genes that are actually involved in a disease.
