A Systematic Approach to Finding Unannotated Proteins

A study suggests that there is more to the eukaryotic genome than was previously suspected.

Katarina Zimmer

After a year teaching an algorithm to differentiate between the echolocation calls of different bat species, Katarina decided she was simply too greedy to focus on one field. Following an internship with The Scientist in 2017, she has been happily freelancing for a number of publications, covering everything from climate change to oncology.

UNEARTHED TREASURE: Confocal microscopy image of a previously unannotated mitochondrial protein, altMiD51 (green), alongside mitochondria (red). ANNIE ROY


S. Samandi et al., “Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins,” eLife, 6:e27860, 2017.

For many years, scientists believed that each eukaryotic gene encoded just one protein and its isoforms, and researchers annotated genomes accordingly. But recent research has shown that individual genes can encode multiple different proteins, and that plenty of proteins arise from regions of the genome that are considered noncoding. Xavier Roucou, a biochemist at the University of Sherbrooke in Quebec, Canada, decided to take a systematic approach to annotating these undocumented proteins.  

To detect regions of the genome that might encode these proteins—so-called “alternative open reading frames” (altORFs)—Roucou and colleagues scanned nine eukaryotic genomes, including the human...
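For readers curious what scanning a sequence for candidate open reading frames can look like in practice, here is a minimal Python sketch. It is a toy illustration, not the pipeline used in the study: the function name `find_orfs`, the restriction to ATG start codons, the single-strand search, and the `min_codons` threshold are all assumptions made for the example.

```python
# Illustrative only: a toy ORF scanner, not the method used by Roucou and colleagues.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end, frame) for each ATG-initiated ORF on one strand.

    start and end are 0-based and end-exclusive; end includes the stop codon.
    min_codons counts the start codon but not the stop codon and is an
    assumed cutoff chosen for this example.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first in-frame ATG not yet closed by a stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

Calling `find_orfs(sequence, min_codons=2)` on a short test sequence returns the coordinates of every start-to-stop stretch that meets the length cutoff. A real annotation effort would also need to search the reverse strand and work from transcript sequences rather than raw strings.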

Update (March 5): The original version of this article mentioned scanning genomes for transcription initiation sites; in fact, they were scanned for translation initiation sites. The Scientist regrets the error.
