Turning Data into Discovery

To make the most of the current data deluge, we must reward interdisciplinary researchers who identify and apply the most appropriate analysis methods.

Jun 1, 2015
Vicki Chandler

As recently as five years ago, nearly all my genetics data could fit on my personal computer and could be analyzed using basic spreadsheet software. Today, my data require sophisticated analysis tools and larger storage solutions, and I am not alone. Scientists across nearly every field, from genetics to neuroscience, physics to ecology, are generating unprecedented volumes of data at speeds that would have seemed like science fiction just a few years ago. For the first time in history, researchers routinely gather more information than they can analyze in a meaningful way. As a result, science is now data-rich but discovery-poor.

The solution to this modern paradox lies in developing technologies to extract meaning from all this information. But new technologies are not sufficient; researchers must also know which technologies and methods are best equipped to address the important questions for their area of study. There simply aren’t enough academic researchers who are capable of harnessing their data deluge.

How can we attract and train the experts needed to transform data into discovery across many scientific fields? This is a problem my colleague Chris Mentzel, program director of the Gordon and Betty Moore Foundation’s Data-Driven Discovery Initiative, and I have been thinking about for several years. One of the challenges facing scientific progress is that career advancement has traditionally been fueled by specialization, individual discovery, and publication in high-impact journals. People capable of solving today’s—and tomorrow’s—data problems don’t fit that traditional model. They are truly interdisciplinary and work at the intersection of computer science, statistics, mathematics, and their discipline of interest. As such, they are often overlooked in the traditional academic hierarchy.

We believe there is an urgent need to cultivate this new type of data-driven researcher by recognizing and rewarding those with the necessary skill set. To this end, in October 2014 the foundation announced 14 recipients of the Moore Investigator in Data-Driven Discovery Awards. In addition to advancing data-driven science, we hope these five-year awards will strengthen incentives at research institutions to support more data-driven researchers by highlighting the value of these types of scientists in academia.

Fostering these interdisciplinary researchers also requires creating supportive and collaborative environments within academic institutions. Last year, the Data-Driven Discovery Initiative announced a five-year, $37.8 million partnership with the Sloan Foundation and three universities (the University of California, Berkeley; the University of Washington; and New York University) to build environments that will create homes for academic data scientists across campuses. Many other universities are also starting to invest in data-science centers with similar goals.

Measuring the success of such programs depends on the definition of academic achievement. Traditional metrics, such as the quality and quantity of papers published, are insufficient. For people working to turn data into discovery, success often means developing and sharing new tools, methodologies, and practices that make it possible to answer research questions at a previously unattainable scale. These research outputs can have substantial scientific impact but are often not rewarded within our current academic culture.

We live in an era of amazing technologies capable of gathering and analyzing astounding amounts of information. We can now sequence more than five whole genomes in a single day. Our telescopes capture ultrahigh-resolution images of the stars every 10 seconds, streaming hundreds of terabytes of data daily. New sensors are capable of capturing vastly complex information that may help predict changes to Earth’s ecosystems. If we invest even a fraction as much in the development of data scientists as we do in the development of new technologies and generation of data, science will one day be as rich in discovery as it is in data.

Vicki Chandler is the chief program officer in science at the Gordon and Betty Moore Foundation.