A large undertaking to reproduce 21 psychology studies published in Nature and Science came to the same conclusion as the original papers 62 percent of the time, according to a report published today (August 27) in Nature Human Behaviour. Even when the results of the original and replication experiments agreed, the effect sizes were smaller the second time around, indicating to the authors that both false positives and “inflated effect sizes” are part of the reproducibility problem in the field.
“A false positive result can make other researchers, and the original researcher, spend lots of time and energy and money on results that turn out not to hold,” study coauthor Anna Dreber, an economics professor at the Stockholm School of Economics, tells NPR. “And that’s kind of wasteful for resources and inefficient, so the sooner we find out that a result doesn’t hold, the better.”
In addition to rerunning the experiments, the project asked researchers to take part in prediction markets, betting on which of the original findings would hold up; those bets largely anticipated which studies would and would not replicate.
“I did a sniff test of whether the results actually make sense,” Paul Smeets, one of the prediction-market participants, from Maastricht University, tells The Atlantic. “Some results look quite spectacular but also seem a bit too good to be true, which usually means they are.”
The 21 original papers were published between 2010 and 2015 and were chosen from Nature and Science because of the journals’ prestige. All of the original authors were alerted to the replication project and asked to give feedback. Among them is Will Gervais, a psychology professor at the University of Kentucky whose 2012 study linked viewing Rodin’s sculpture “The Thinker” to reduced religious belief. It didn’t replicate, just as the prediction markets anticipated. “Our study in hindsight was outright silly,” Gervais tells The Washington Post.
“We should not treat publication in Science or Nature as a mark of a particularly robust finding or a particularly skilled researcher,” Simine Vazire, a psychologist at the University of California, Davis, who was not involved in the study, tells The Atlantic. These journals “are not especially good at picking out really robust findings or excellent research practices. And the prediction market adds to my frustration because it shows that there are clues to the strength of the evidence in the papers themselves.”