NIH Grant Reviews Don’t Predict Success

Peer reviewers’ assessments of funding proposals to the National Institutes of Health don’t correlate well with later publication citations, a study shows.

Feb 18, 2016
Kerry Grens

Judgments made by peer reviewers about the merit of grant proposals submitted to the National Institutes of Health (NIH) aren’t predictive of the subsequent productivity of the funded research. That’s the conclusion of a recent study, published February 16 in eLife, which analyzed the number of publications and citations resulting from funded projects.

“The excellent productivity exhibited by many projects with relatively poor scores and the poor productivity exhibited by some projects with outstanding scores demonstrate the inherent unpredictability of scientific research,” Ferric Fang of the University of Washington School of Medicine and coauthors wrote in their report.

Studies on the predictive power of peer-review panels have yielded mixed results; some show the panels are not very good at identifying which projects will be the most productive. Yet a recent review of more than 130,000 grant proposals found that peer reviewers’ high scores correlated well with the research projects that would end up with the most publications, citations, and patents.

That study included all proposals, both high- and low-ranked, but Fang’s team wanted to look only at meritorious applications. So the researchers selected about 103,000 grants, those with percentile scores of 20 or better (most funded grants sit within the top 10 percent).

They found that the number of publications, and the citations those publications received, varied widely at each percentile, although proposals scoring at the 2nd percentile or better yielded more citations than lower-ranked proposals.

“While our re-analysis confirms that there is a correlation between percentile score and publication or citation productivity for applications with scores in the top 20 percentiles, the correlation is quite modest . . . suggesting that the overall ability of review groups to predict application success is weak at best,” Fang’s team wrote.

“When people’s opinions count a lot, we may be doing worse than choosing at random,” study coauthor Arturo Casadevall of the Johns Hopkins Bloomberg School of Public Health said in a press release. “A negative word at the table can often swing the debate. And this is how we allocate research funding in this country.”

The researchers propose a system in which projects ranked highly by peer reviewers enter a lottery for funding.
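The proposed mechanism can be sketched as a simple random draw among applications that clear a merit threshold. The cutoff, award count, and data structure below are hypothetical illustrations, not details from the study.

```python
import random

def lottery_fund(applications, percentile_cutoff=20, n_awards=10, seed=None):
    """Sketch of a funding lottery: every proposal whose peer-review
    percentile score clears the cutoff (lower is better in NIH scoring)
    gets an equal chance at an award, regardless of its exact rank."""
    rng = random.Random(seed)
    eligible = [a for a in applications if a["percentile"] <= percentile_cutoff]
    return rng.sample(eligible, min(n_awards, len(eligible)))

# Hypothetical applicant pool: proposal ids with percentile scores.
pool = [{"id": i, "percentile": p} for i, p in enumerate([1, 5, 12, 18, 25, 40])]
funded = lottery_fund(pool, percentile_cutoff=20, n_awards=2, seed=0)
# Every winner cleared the merit bar; among those, selection was random.
assert all(a["percentile"] <= 20 for a in funded)
```

The key design point is that reviewers still act as a quality filter, but fine-grained rank ordering within the meritorious pool, which the study suggests carries little predictive signal, no longer determines who gets funded.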