Unlike many other physiological measurements that have been relegated to automation, the detection of the more subtle stages of sleep has remained within the purview of experts in sleep clinics. But relying on humans for data processing is tedious, costly, and vulnerable to subjectivity, so researchers have been developing a number of automated methods to single out patterns of brain waves in EEG recordings. Emmanuel Mignot, director of the Stanford Center for Sleep Sciences and Medicine, wanted to see how those automated methods hold up against the detection abilities of both experts and nonexperts viewing the same EEG readings.
Of particular interest to Mignot are sleep spindles, small bursts of activity characteristic of stage-2 sleep, a 20-minute period in which body temperature drops and the heart rate begins to slow. Spindles last only about a half second and occur roughly twice a minute. Mignot’s team asked 24 sleep experts and 114 nonexperts to pick out sleep spindles from the EEG data of 110 sleepers. Then the researchers compared the accuracy of the nonexpert crowd and of six automated spindle-detection algorithms with the accuracy of the experts. Against a gold standard defined by the consensus of the group of experts—an accuracy score of 1—the best computer algorithm reached a score of only 0.53 (Nat Meth, 11:385-92, 2014). “The computer programs were horrible,” Mignot says. “They were just very bad.”
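To give a sense of how detections can be scored against an expert-consensus gold standard, here is a minimal sketch of one common approach: match each detected event to an overlapping gold-standard event and compute an F1-style score, which equals 1 for perfect agreement. The function name, interval representation, and overlap-based matching rule are illustrative assumptions, not necessarily the study’s exact metric.

```python
# Sketch of overlap-based event scoring for spindle detection.
# Assumptions (not from the study): events are (start, end) intervals
# in seconds, and a detection counts as a true positive if it overlaps
# any not-yet-matched gold-standard event.

def f1_against_gold(detected, gold):
    gold = list(gold)
    matched = set()          # indices of gold events already claimed
    tp = 0
    for d_start, d_end in detected:
        for i, (g_start, g_end) in enumerate(gold):
            if i in matched:
                continue
            if d_start < g_end and g_start < d_end:  # intervals overlap
                matched.add(i)
                tp += 1
                break
    fp = len(detected) - tp  # detections with no gold match
    fn = len(gold) - tp      # gold events nobody detected
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Perfect agreement with the gold standard scores 1.0:
gold = [(10.0, 10.5), (40.0, 40.6)]
print(f1_against_gold(gold, gold))            # 1.0
print(f1_against_gold([(10.1, 10.4)], gold))  # one hit, one missed event
```

Under a scheme like this, a score of 0.53 means the algorithm’s detections, weighed for both false alarms and missed spindles, agree only about half as well with the expert consensus as the experts agree with themselves.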
The nonexperts performed considerably better than the automated methods, reaching an overall accuracy score of 0.67, which was close to the average performance of any individual sleep expert. But the crowd far exceeded the experts in speed, analyzing twice as much data in three days as the experts processed in 10 months. “It’s pretty good,” Mignot says. “Of course, you have to use more nonexperts to come up with a consensus that’s as good as the experts’, but definitely you can do almost as well, and way, way better than any computer program.”
Scientists have used crowdsourcing to help them answer a variety of biological questions. (See “Games for Science,” The Scientist, January 2013.) And in those cases when the performance of the crowd is measured against that of computers, the crowd often does better. Consider EteRNA, a wildly popular “open laboratory” in which tens of thousands of participants solve RNA structure puzzles. The players design RNA sequences intended to fold into a specific shape, and each week Rhiju Das’s lab at Stanford University chooses several of them to test in real life and see if the sequence will actually build the target structure. “It was totally a risk,” Das says. “I didn’t know whether players would be able to analyze previous experimental data and then design experiments to test their ideas.” After a couple of months of these weekly experimental cycles, the humans were doing as well as the computer algorithms. By five or six months, the crowd’s median performance was better than the best a computer could do, says Das (PNAS, doi:10.1073/pnas.1313039111, 2014).
Josh Tenenbaum, a computational cognitive scientist at MIT, has a clue as to why nonexperts can outperform our best automated methods. “Humans are amazing pattern-recognition machines, and no system has come close to matching our general ability to recognize patterns, find signals in noise, and to learn to do so in new kinds of domains,” says Tenenbaum, who studies human cognition to design machines that can learn as well as we do.
Tenenbaum says that humans are particularly good at seeing the big, coherent picture as well as the small patterns within. Computer systems build a view from the bottom up—recognizing patterns in small patches of space or time; compiling these features and weighting them appropriately; and then assembling them into a bigger and bigger understanding. “But as you go to larger and larger spatial and temporal scales [the abilities of computers to form a coherent global understanding] tend to break down,” says Tenenbaum. Humans, however, can analyze data from the bottom up in tandem with a top-down view of the larger pattern as a whole. “We can see the forest for the trees.”
Matching humans’ visual abilities is also tough. “We kind of take for granted how good the visual system is,” says the University of Maryland’s David Jacobs, who studies how to program computers to recognize objects within images. “There are a lot of things that pop out at us visually that are difficult to automate.”
Humans have an evolutionary advantage in this competition against machines. Tenenbaum is optimistic, however, that computers can catch up in the types of pattern recognition and problem solving that scientists like Das and Mignot are asking them to do. “The more we can take inspiration from the ways humans do it, that’s going to be a good thing,” says Tenenbaum. “It’s certainly possible that when we understand why humans are so good at pattern recognition, we could build better systems.”
Jérôme Waldispühl, a researcher at McGill University who developed a DNA sequence-alignment video game called Phylo, says directly pitting computers against the crowd is misguided. Rather, having them join forces may be the most powerful way to advance science. “A human alone and computer alone won’t perform as well as humans and computers together,” he says.