
A Systematic Approach to Finding Unannotated Proteins

A study suggests that there is more to the eukaryotic genome than was previously suspected.

Mar 1, 2018
Katarina Zimmer

UNEARTHED TREASURE: Confocal microscopy image of a previously unannotated mitochondrial protein, altMiD51 (green), alongside mitochondria (red) ANNIE ROY

EDITOR'S CHOICE IN GENETICS & GENOMICS

THE PAPER
S. Samandi et al., “Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins,” eLife, 6:e27860, 2017.

HIDDEN GEMS
For many years, scientists believed that each eukaryotic gene encoded just one protein and its isoforms, and researchers annotated genomes accordingly. But recent research has shown that individual genes can encode multiple different proteins, and that plenty of proteins arise from regions of the genome that are considered noncoding. Xavier Roucou, a biochemist at the University of Sherbrooke in Quebec, Canada, decided to take a systematic approach to annotating these undocumented proteins.  

TREASURE HUNT
To detect so-called “alternative open reading frames” (altORFs), regions of the genome that might encode these proteins, Roucou and colleagues scanned nine eukaryotic genomes, including the human genome, for translation initiation sites and stop codons. They then translated the resulting open reading frames in silico to predict the corresponding proteins, ending up with 183,191 possible unannotated proteins in the human transcriptome alone. Many of these had orthologs in the genomes of the other species examined and appeared to have functional domains.
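The general idea of scanning a sequence for start and stop codons and translating what lies between them can be illustrated with a short Python sketch. This is not the pipeline used in the paper: the function name `find_orfs`, the `min_codons` cutoff, the restriction to ATG start codons, and the forward-strand-only scan are all assumptions made for illustration, and the example sequence is invented.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_orfs(seq, min_codons=2):
    """Return (start, end, protein) for each ORF on the forward strand.

    Hypothetical helper: an ORF here runs from the first ATG in a reading
    frame to the next in-frame stop codon. `min_codons` is an illustrative
    cutoff, not a value taken from the study.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # candidate translation initiation site
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if (i - start) // 3 >= min_codons:
                    # Translate in silico; unknown codons become "X".
                    protein = "".join(
                        CODON_TABLE.get(seq[j:j + 3], "X")
                        for j in range(start, i, 3)
                    )
                    orfs.append((start, i + 3, protein))
                start = None  # resume looking for the next start codon
    return orfs


# Invented sequence: one ORF in frame 0 and a shorter, overlapping ORF in frame 1.
example = "ATGAATGGCATAAGCTGA"
print(find_orfs(example))
# [(0, 18, 'MNGIS'), (4, 13, 'MA')]
```

In this toy sequence, the second result sits inside the first but in a different reading frame, which is the kind of overlapping candidate an annotation that records only one coding sequence per gene would miss.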

ELUSIVE PROTEINS
To estimate how many of the putative alternative proteins are expressed in humans, the researchers searched proteomics data that other studies had collected from human samples and detected nearly 5,000 of them. For Roucou, the results suggest that the genome harbors many overlooked proteins. “We cannot ignore them anymore,” he says.

JUST THE BEGINNING
Judith Steen, a neurologist at Harvard Medical School, finds the results intriguing. However, she notes that it’s still unknown how many of the predicted proteins are actively translated in vivo, under what circumstances, and what role they play. “From my perspective, a lot of work needs to be done,” she says. “These are baby steps.”

Update (March 5): The original version of this article mentioned scanning genomes for transcription initiation sites; in fact, they were scanned for translation initiation sites. The Scientist regrets the error.
