A Systematic Approach to Finding Unannotated Proteins

A study suggests that there is more to the eukaryotic genome than was previously suspected.

By Katarina Zimmer | March 1, 2018

UNEARTHED TREASURE: Confocal microscopy image of a previously unannotated mitochondrial protein, altMiD51 (green), alongside mitochondria (red). Image: ANNIE ROY


S. Samandi et al., “Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins,” eLife, 6:e27860, 2017.

For many years, scientists believed that each eukaryotic gene encoded just one protein and its isoforms, and researchers annotated genomes accordingly. But recent research has shown that individual genes can encode multiple different proteins, and that plenty of proteins arise from regions of the genome that are considered noncoding. Xavier Roucou, a biochemist at the University of Sherbrooke in Quebec, Canada, decided to take a systematic approach to annotating these undocumented proteins.  

To detect regions of the genome that might encode these proteins, so-called “alternative open reading frames” (altORFs), Roucou and colleagues scanned nine eukaryotic genomes, including the human genome, for translation initiation sites and stop codons. They then translated the resulting open reading frames in silico to predict the corresponding proteins, ending up with 183,191 possible unannotated proteins in the human transcriptome alone. Many of these had orthologs in the other species examined and appeared to have functional domains.
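The general idea of scanning a sequence for start and stop codons and translating what lies between them can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the restriction to ATG start codons on the forward strand, and the `min_codons` length cutoff are all assumptions made for illustration, since the article does not state the study's exact criteria.

```python
# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START = "ATG"                      # assumption: only canonical ATG starts
STOPS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(seq, min_codons=30):
    """Return (start, end, protein) for each ORF in the three forward frames.

    An ORF runs from the first ATG after the previous stop codon to the next
    in-frame stop. `min_codons` (hypothetical cutoff) counts the start codon
    but not the stop. `end` is the index just past the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None:
                if codon == START:
                    start = pos
            elif codon in STOPS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, translate(seq[start:pos])))
                start = None
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
    for start, end, protein in find_orfs(toy, min_codons=3):
        print(start, end, protein)
```

A real annotation effort would also need to handle transcript isoforms, overlap with already annotated coding sequences, and alternative start codons, none of which this sketch attempts.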

To estimate how many of the putative alternative proteins are expressed in humans, the researchers searched proteomics data collected from human samples in other studies and detected nearly 5,000 of them. For Roucou, the results suggest that the genome harbors many overlooked proteins. “We cannot ignore them anymore,” he says.

Judith Steen, a neurologist at Harvard Medical School, finds the results intriguing. However, she notes that it’s still unknown how many of the predicted proteins are actively translated in vivo, under what circumstances, and what role they play. “From my perspective, a lot of work needs to be done,” she says. “These are baby steps.”

Update (March 5): The original version of this article mentioned scanning genomes for transcription initiation sites; in fact, they were scanned for translation initiation sites. The Scientist regrets the error.

March 5, 2018

Thank you for this article. There's a tiny mistake though: in the "treasure hunt" paragraph, it should read "...scanned nine eukaryotic genomes [...] for translation initiation sites and stop codons" rather than "transcription initiation sites and stop codons".

Shawna

Posts: 113

Replied to a comment from Benoit Vanderperre made on March 5, 2018

March 5, 2018

Thank you for letting us know. The error has been corrected.

Annie Roy

Posts: 1

March 5, 2018

I'm very happy about this article too! I would like to point out that it's Annie Roy instead of Anne Roy for the reference of the picture. Thank you :)

Shawna

Posts: 113

Replied to a comment from Annie Roy made on March 5, 2018

March 5, 2018

Sorry about that! It's been fixed.
