DANIEL MARBACH AND VALDO PEIXOTO
Statistician Liuxia Wang is in the business of prediction. At Sentrana Inc., a Washington, D.C.-based sales and marketing company, she uses data on consumers’ spending behaviors to forecast future trends in the food-service industry. But she never imagined her analytical expertise could be put to use helping patients suffering from amyotrophic lateral sclerosis (ALS), the fatal neurodegenerative disease commonly known as Lou Gehrig’s disease.
“At the end of the day, data is data, whether you’re using it to find the optimal price for certain products or estimating a disease’s course,” says Wang. “But it’s nice to be able to use predictive analytics to benefit patients.”
Working under the team name “Sentrana,” Wang and her colleague Guang “Eric” Li, a quantitative modeler, beat a global community of experts in statistics, machine learning, and computational biology to win the ALS Prediction Prize. Hosted by the ALS-focused Prize4Life and the cross-institutional collaboration DREAM (Dialogue for Reverse Engineering Assessments and Methods), the crowdsourced event challenged number crunchers to develop a computer algorithm to predict ALS disease progression using clinical-trial data from the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database.
Team Sentrana shared first prize with a separate team comprising Lester Mackey, a Stanford University postdoc, and Lilly Fang, a lawyer. The two teams each won $20,000, and a runner-up—University of Zurich statistician Torsten Hothorn—received $10,000.
Overall, the two top algorithms predicted disease progression better than both a baseline model and a model developed by a dozen ALS clinicians—improvements that challenge organizers estimate could allow a drug sponsor to reduce the size of a Phase 3 clinical trial by at least 20 percent, saving as much as $6 million.
Each year, about 5,000 people in the U.S. are diagnosed with ALS, with most succumbing to the disease within three to five years. But some people with ALS, such as baseball great Lou Gehrig, live for less than three years, while others, such as physicist Stephen Hawking, live for decades.
“When I see patients, the number one question I get is, ‘What is going to happen to me?’ Unfortunately, we really aren’t able to predict that,” says Stephen Kolb, Director of the ALS/Motor Neuron Disease Clinic at Ohio State University Wexner Medical Center.
The disease’s extraordinary variability doesn’t just leave patients in the dark, says Neta Zach, Prize4Life’s chief scientific officer. It also presents “massive challenges” in developing new treatments, because drug trials need large cohorts of ALS patients to detect significant therapeutic effects. If the ALS Prediction Challenge can inspire more accurate methods of estimating disease progression, clinical trial design and execution could be improved, increasing the likelihood of bringing new treatments to market.
“This knowledge will also help us have more-meaningful discussions with patients and their families about disease management and end-of-life care,” Kolb says.
As is often the case in computational challenges, the majority of the ALS Prediction Prize competitors had zero previous experience with the disease at hand.
“My client’s son had recently been diagnosed with ALS, and she asked me if the analytics we used to develop commercial solutions could be used to make progress in the disease,” Wang says. “I thought that I should try to do something to help.”
IBM researcher Gustavo Stolovitzky, who founded and heads up DREAM and has run more than 30 DREAM challenges, says the ALS Prediction Prize’s ability to attract a diverse group is one of its greatest strengths. “The researchers who have the best answer to a given scientific problem may not be the same researchers that generated the data or formulated the scientific questions,” he says. “Why not ask everybody who is ready to roll up their sleeves, even if they come from other fields?”
More than 1,000 solvers competed for the ALS Prediction Prize. They were given access to clinical data from 1,822 ALS patients in the PRO-ACT database, collected during the three months following their diagnoses. Using those data, they were asked to develop and train algorithms that predicted the condition of each patient nine months later according to the ALSFRS, a standard functional scale that measures patients’ abilities to move, care for themselves, speak, and breathe. A total of 37 algorithms were submitted.
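To make the prediction task concrete, here is a minimal sketch in Python of the simplest approach a solver might start from: fit a straight line to a patient's ALSFRS scores from the first three months and extend it forward. This is an illustrative assumption, not the method used by any competing team or by the challenge's baseline model. The function names and the sample scores are hypothetical.

```python
def fit_slope(months, scores):
    """Ordinary least-squares slope of score versus time (points per month)."""
    n = len(months)
    if n != len(scores) or n < 2:
        raise ValueError("need at least two visits with matching scores")
    mean_t = sum(months) / n
    mean_s = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in months)
    if sxx == 0:
        raise ValueError("visits must occur at different times")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(months, scores))
    return sxy / sxx


def extrapolate_score(months, scores, target_month):
    """Extend the fitted line to a later month (a naive linear assumption)."""
    slope = fit_slope(months, scores)
    n = len(months)
    intercept = sum(scores) / n - slope * sum(months) / n
    return intercept + slope * target_month


# Hypothetical patient: four visits in the first three months.
visit_months = [0, 1, 2, 3]
visit_scores = [40, 39, 38, 37]

# Nine months after the three-month window is month 12.
predicted = extrapolate_score(visit_months, visit_scores, target_month=12)
```

A straight line ignores everything the winning teams found informative, such as time since onset or changes in body weight, which is the gap the competition was designed to close.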
InnoCentive, an organization that hosts crowdsourced competitions, evaluated the algorithms by comparing the predicted with the actual state of each patient. The algorithms were then assessed on a third data set, fully blinded and previously unseen, that was reserved for validation. The winning models were selected according to performance on this validation set.
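Comparing predicted with actual outcomes can be done in several ways, and the exact scoring formula is not described here. As one plausible illustration, the Python sketch below computes the root-mean-square error between a set of predictions and the observed values, where a lower number means a closer match. The function name and the sample numbers are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of numbers."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical held-out patients: predicted versus observed ALSFRS scores.
predicted_scores = [28.0, 31.5, 22.0]
observed_scores = [30.0, 31.5, 20.0]
error = rmse(predicted_scores, observed_scores)
```

Scoring on patients the algorithm has never seen, as the validation step did, guards against models that merely memorize their training data.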
The Sentrana team’s algorithm showed that the speed of decline or stability of face-related ALSFRS scores, which include tests of speech and swallowing, predicted overall disease trajectory. The algorithm also showed that a decline in breathing-related scores signaled the end stage of the disease. The Stanford team’s algorithm found that the elapsed time from disease onset to when patients entered clinical trials best predicted the nine-month trajectory of ALS, and that past rate of decline on the ALSFRS, fluctuations in body weight, and speaking ability were also highly predictive of disease progression.
Prize4Life CSO Zach says the two winning algorithms, which were recently published in Nature Biotechnology (33:51-57, 2015), “were essentially identical in predictiveness.”
The challenge also identified several potential nonstandard predictors of disease progression—uric acid, creatinine, and, surprisingly, blood pressure—that may shed light on ALS pathobiology. These findings prompted challenge organizers to host a second Prize4Life challenge, the ALS Stratification Challenge, which will ask solvers to develop computer algorithms that classify 9,000 ALS patients according to disease characteristics.
The ALS Stratification Challenge is hosted by Prize4Life, DREAM, and Sage Bionetworks, which merged its open-science efforts with DREAM in 2013. The new challenge will begin registration this May and launch in June. In addition to crowdsourcing a community of solvers, the collaborators are crowdfunding donations for the prizes that will be awarded to winners of the competition.