Computers That Can Smell

Teams of modelers compete to develop algorithms for estimating how people will perceive a particular odor from its molecular characteristics.

May 1, 2017
Kerry Grens

ANDRZEJ KRAUZE

In June of 2014, Pablo Meyer went to Rockefeller University in New York City to give a talk about open data. He leads the Translational Systems Biology and Nanobiotechnology group at IBM Research and also guides the so-called DREAM challenges, short for Dialogue for Reverse Engineering Assessments and Methods. These projects crowdsource the development of algorithms from open data to make predictions for all manner of medical and biological problems, such as prostate cancer survival or how quickly ALS patients’ symptoms will progress. Andreas Keller, a neuroscientist at Rockefeller, was in the audience that day, and afterward he emailed Meyer with an offer and a request. “He said, ‘We have this data set, and we don’t model,’” recalls Meyer. “‘Do you think you could organize a competition?’”

The data set Keller had been building was far from ordinary. It was the largest collection of odor perceptions of its kind—dozens of volunteers, each having made 10 visits to the lab, described 476 different smells using 19 descriptive words (including sweet, urinous, sweaty, and warm), along with the pleasantness and intensity of the scent. Before Keller’s database, the go-to catalog at researchers’ disposal was a list of 10 odor compounds, described by 150 participants using 146 words, which had been developed by pioneering olfaction scientist Andrew Dravnieks more than three decades earlier.

Meyer was intrigued, so he asked Keller for the data. Before launching a DREAM challenge, Meyer has to ensure that the raw data provided to competitors do indeed reflect some biological phenomenon. In this case, he needed to be sure that algorithms could determine what a molecule might smell like when only its chemical characteristics were fed in. There were more than 4,800 molecular features for each compound, including structural properties, functional groups, chemical compositions, and the like. “We developed a simple linear model just to see if there’s a signal there,” Meyer says. “We were very, very surprised we got a result. We thought there was a bug.”
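To make the idea of a quick signal check concrete, here is a minimal sketch in Python. It is not Meyer's model: the data are synthetic, the five features stand in for the thousands of real molecular descriptors, and every name (`fit_linear`, `pearson`, the chosen weights) is an assumption made for illustration. The sketch fits a linear model on some "molecules," predicts ratings for held-out ones, and compares the resulting correlation against a shuffled-label baseline.

```python
import random

def fit_linear(X, y, lr=0.5, steps=300):
    """Fit weights and a bias by plain gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        # Residuals are computed once per step so all weights update together.
        errs = [sum(wj * xj for wj, xj in zip(w, row)) + b - yi
                for row, yi in zip(X, y)]
        for j in range(d):
            w[j] -= lr * sum(e * row[j] for e, row in zip(errs, X)) / n
        b -= lr * sum(errs) / n
    return w, b

def predict(X, w, b):
    return [sum(wj * xj for wj, xj in zip(w, row)) + b for row in X]

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(0)

# Synthetic stand-in data: 100 "molecules" with 5 made-up features each.
# By construction, only features 0 and 3 influence the made-up rating.
X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(100)]
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0, 0.1) for row in X]

X_train, y_train = X[:60], y[:60]
X_test, y_test = X[60:], y[60:]

w, b = fit_linear(X_train, y_train)
pred = predict(X_test, w, b)

# Correlation on held-out molecules: high if the features carry signal.
r_real = pearson(pred, y_test)

# Baseline: the same predictions scored against shuffled ratings.
shuffled = y_test[:]
rng.shuffle(shuffled)
r_null = pearson(pred, shuffled)

print(f"held-out r = {r_real:.3f}, shuffled baseline r = {r_null:.3f}")
```

A held-out correlation that clearly beats the shuffled baseline suggests the features predict the ratings, rather than the result being an artifact of the pipeline.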

In January 2015, the call went out to modelers to join a competition for designing the best model from data on 69 odors to predict their scent profiles. Eighteen teams submitted algorithms. They performed fairly well at estimating the presence of certain qualities in an odor—garlicky, fishy, sweet, or burnt, for example—and especially well at predicting how intense or pleasant a smell would be. “It’s a very impressive effort to collect this much data, and it allowed them to model responses and descriptors better than has been done before,” says Kobi Snitz, a modeling specialist in Noam Sobel’s olfaction research group at the Weizmann Institute of Science in Rehovot, Israel, who did not participate in the competition.

See "May the Best Model Win" to read more about DREAM challenges.

One of the results that surprised Meyer most was the second-place performance of a linear model. That algorithm took different parts of each molecule and generated predictions of how each bit would smell—one part might evoke a bakery, for instance, and another, grass. Meyer speculates that this may reflect something fundamental about olfaction and the way odors interact with receptors. Rather than an entire molecule matching a distinct receptor, perhaps it interacts with numerous receptors, with each responding to these various molecular subunits.

Although his data set contained thousands of molecular features, Keller says very few were required to describe each molecule’s smell. “If you know the features of the molecule that make something smell like garlic, you can look at those few and have a pretty good prediction,” he says. “A nice step would be to see how that relates to the binding of odor molecules to odor receptors. If you only have a few features that are important, it becomes a more tractable problem.”
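One simple way to see how a handful of features can carry most of the predictive weight is to rank every feature by how strongly it correlates with a rating. The Python sketch below does this on synthetic data. It is an illustration only, not the method any DREAM team used; the 20 features, the two "informative" ones, and the variable names are all assumptions.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(1)
n_molecules, n_features = 200, 20

# Synthetic data: by construction, only features 0 and 7 drive the
# hypothetical "garlic" rating; the other 18 features are pure noise.
X = [[rng.uniform(-1, 1) for _ in range(n_features)]
     for _ in range(n_molecules)]
ratings = [3.0 * row[0] - 2.0 * row[7] + rng.gauss(0, 0.1) for row in X]

# Score each feature by the absolute correlation of its column with the rating.
scores = []
for j in range(n_features):
    column = [row[j] for row in X]
    scores.append((abs(pearson(column, ratings)), j))

# Keep the two highest-scoring features.
top = sorted(scores, reverse=True)[:2]
top_features = sorted(j for _, j in top)
print("most informative features:", top_features)
```

In real data the informative features would not be known in advance, and correlated descriptors would make a one-at-a-time ranking like this less reliable than it is here.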

Keller says there’s no consensus in the olfaction field about how the sense works. “The basic science issue is we really have no idea what’s in the odor that makes us [perceive a certain smell],” agrees Johan Lundström, who leads olfactory research groups at the Karolinska Institute in Stockholm and the Monell Chemical Senses Center in Philadelphia. Keller’s database could offer some insight as researchers continue to probe it (the team has made it publicly available), but there’s a limitation: it only includes pure odors, rather than mixtures. “Most odors are not monomolecular,” says Lundström. “Ninety-nine-point-nine percent are complicated mixtures that consist of anywhere from two to 500 different chemicals.”

Several years ago, Snitz and colleagues developed an algorithm to predict the similarity of certain odor mixtures (PLOS Comp Biol, 9:e1003184, 2013). “It turned out that the model works better when mixtures are represented as a single entity rather than as a collection of distinct components,” Snitz says.
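As a rough illustration of what treating a mixture as a single entity could look like, the Python sketch below collapses each mixture into one vector by summing its components' feature vectors, then compares two mixtures by the cosine of the angle between those vectors. This is an assumption-laden sketch, not the published model: the three-element feature vectors are invented, and the function names are hypothetical.

```python
import math

def mixture_vector(components):
    """Sum component feature vectors element-wise into one mixture vector."""
    return [sum(vals) for vals in zip(*components)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented feature vectors for illustration only.
mix_a = mixture_vector([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # two components
mix_b = mixture_vector([[2.0, 2.0, 0.0]])                   # one component
mix_c = mixture_vector([[0.0, 0.0, 1.0]])                   # one component

sim_ab = cosine_similarity(mix_a, mix_b)
sim_ac = cosine_similarity(mix_a, mix_c)
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

Here mixtures A and B contain different components but point in the same direction once summed, so they score as similar. A component-by-component comparison would miss this.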

Keller is already working on a data set of odor mixtures, using an approach similar to Snitz’s study, but asking study subjects to rate similarities between different smells, rather than to use semantic descriptors. Until this collection is ready, researchers can play around with the data set used in the DREAM challenge. And for eager modelers, Meyer and other DREAM leaders create new challenges every six months. “It’s a very nice idea to have this kind of competition,” says Lundström. “Scientists are naturally competitive. This way you can use that competition to do something great for the community.”
