When a Rose Must Be Called a Rose


Douglas Brutlag challenges students in his computational biology classes at Stanford University to search the large proteomics databases for yeast membrane proteins. Without knowledge of the database lexicons, the students generally come up well short of the mark. "They find 20 to 200," says Brutlag, professor of biochemistry and medicine at Stanford's School of Medicine. "In fact, there are almost 2,000 proteins."

The problem: linguistics. "These are controlled vocabularies," Brutlag explains. "The key words for membrane proteins are trans membrane, inner membrane, and outer membrane, and unless you have synonyms for all of those, you miss them when you search the data."
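The gap Brutlag describes can be sketched in a few lines of Python. This is only an illustrative toy, not how any proteomics database actually implements search: the records, the IDs, and the helper names `expand_query` and `search` are hypothetical, and the synonym table simply reuses the three keywords from his quote.

```python
# Hypothetical synonym table. The three keywords come from Brutlag's quote;
# mapping them to the phrase "membrane protein" is an assumption for illustration.
MEMBRANE_SYNONYMS = {
    "membrane protein": ["trans membrane", "transmembrane",
                         "inner membrane", "outer membrane"],
}

# Toy records (invented, not real database entries): ID -> keyword annotations.
RECORDS = {
    "P1": ["Transmembrane", "Transport"],
    "P2": ["Inner membrane", "ATP synthesis"],
    "P3": ["Outer membrane", "Porin"],
    "P4": ["Membrane protein"],
    "P5": ["Cytoplasm", "Kinase"],
}


def expand_query(term, synonyms):
    """Return the lowercased term plus any synonyms listed for it."""
    key = term.lower()
    return [key] + synonyms.get(key, [])


def search(records, term, synonyms=None):
    """Return IDs of records whose keywords contain the term or a synonym."""
    terms = expand_query(term, synonyms or {})
    hits = []
    for rec_id, keywords in records.items():
        text = " ".join(keywords).lower()
        if any(t in text for t in terms):
            hits.append(rec_id)
    return hits


# A literal search sees only the record annotated with the exact phrase.
naive = search(RECORDS, "membrane protein")
# A synonym-aware search also picks up the controlled-vocabulary keywords.
expanded = search(RECORDS, "membrane protein", MEMBRANE_SYNONYMS)
print(naive, expanded)
```

In this toy set, the literal query matches one of the four membrane-related records, while the synonym-aware query matches all four, which mirrors on a small scale the shortfall the students run into.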

Multiply the graduate students' difficulty in searching databases by orders of magnitude, and you have the trials facing researchers in academia and industry worldwide. There is a Babel of different computer languages, imagery systems, and software programs that use unique symbols and store their biological treasures in distinctive ...


Meet the Author

  • Paula Park

Published In

February 2025, Issue 1