
Tackling peer review bias


By | July 28, 2008

New statistical analyses of the National Institutes of Health's peer review process suggest that the current system may be missing the mark on funding the right proposals. Reviews of as many as 25% of all proposals are biased, according to a study led by Valen Johnson (http://gsbs.uth.tmc.edu/tutorial/johnson_v.html) of MD Anderson Cancer Center, to be published tomorrow (July 29) in Proceedings of the National Academy of Sciences.

Johnson collected scoring data from about 14,000 reviewers on some 18,000 proposals from two reviewing sessions in 2005. He developed a statistical tool that analyzed how reviewers changed their scores for each proposal once a study group of five or six reviewers had discussed each application. Johnson found, for example, that certain reviewers judged consistently more harshly and may have influenced how the rest of the reviewing study section rated a proposal. Johnson also demonstrated that, based on reviewers' assessments, there isn't much difference in quality among proposals that scored in the low range. If that's the case, Johnson told The Scientist, the cheapest ones should be funded. By factoring in cost this way, NIH could fund more proposals, he added.

Antonio Scarpa (http://cms.csr.nih.gov/AboutCSR/Welcome+to+CSR/BioforDrAntonioScarpa.htm), director of the NIH Center for Scientific Review (CSR), told The Scientist that the peer review process is more complicated than one paper can take into account, and that judging proposals is like critiquing a movie after having read only a paragraph-long description. Last month the NIH announced (http://www.the-scientist.com/blog/display/54733/) upcoming changes to the peer review process after last year's review of peer review. Scarpa said that the CSR is working to implement a ranking system (as opposed to an individual scoring system) and to have each reviewer state their scoring criteria -- whether the reviewer values an investigator's achievements over the proposal itself, for example. This takes into account some of the reviewer bias that Johnson identified, Scarpa said.

"[Johnson] does a good job of identifying the weaknesses in his own model," Andrea Kopstein, director of planning, analysis and evaluation at the CSR, told The Scientist. For example, "we really don't know the true proposal merits. And so many of the issues raised in this paper are already under study and changes are being implemented."

In a paper (http://www.plosone.org/article/fetchArticle.action?articleURI=info:doi/10.1371/journal.pone.0002761) published last week in the journal PLoS ONE, David Kaplan (http://path-www.path.cwru.edu/information6.php?info_id=30) of Case Western Reserve showed that it would take more than 30,000 reviewers to make a good, unbiased assessment of a single proposal. Instead of hiring tens of thousands of reviewers, an obviously impossible solution, Kaplan told The Scientist, he proposes changing the proposal grading system. The current system has 41 grades that a reviewer can assign. By giving reviewers only five grading options, for example, and by shortening proposals to only a few pages, reviewers could quickly assess the value of a given application.

Both analyses suggest that the NIH could save money -- and may improve proposal scoring accuracy -- by having reviewers evaluate proposals on their own rather than in study groups, thus eliminating the need for meeting and travel expenses.
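
Johnson's actual model is more elaborate than a news summary can convey, but the core idea -- separating a reviewer's habitual severity from a proposal's underlying quality -- can be sketched with a toy additive decomposition. The Python sketch below is a hypothetical illustration on invented data, not Johnson's method: it assumes each raw score is a proposal effect plus a reviewer effect plus noise, and estimates both by alternating averages of residuals. All names and numbers are made up.

# Minimal sketch (not Johnson's model): decompose review scores as
#   score[r, p] = grand_mean + quality[p] + severity[r] + noise
# and estimate the offsets by alternating averages over a sparse score table.
# Reviewers, proposals, and scores are all invented, for illustration only.
from collections import defaultdict

# (reviewer, proposal, score) triples; lower scores are better, as at NIH.
scores = [
    ("rev_A", "prop_1", 1.5), ("rev_A", "prop_2", 2.0), ("rev_A", "prop_3", 3.0),
    ("rev_B", "prop_1", 2.5), ("rev_B", "prop_2", 3.0), ("rev_B", "prop_3", 4.0),  # harsher reviewer
    ("rev_C", "prop_2", 2.0), ("rev_C", "prop_3", 3.0),
]

def fit_additive_model(scores, n_iter=50):
    severity = defaultdict(float)   # per-reviewer offset from the overall mean
    quality = defaultdict(float)    # per-proposal offset from the overall mean
    grand_mean = sum(s for _, _, s in scores) / len(scores)
    for _ in range(n_iter):
        # Update reviewer severities given the current quality estimates.
        by_reviewer = defaultdict(list)
        for r, p, s in scores:
            by_reviewer[r].append(s - grand_mean - quality[p])
        for r, resid in by_reviewer.items():
            severity[r] = sum(resid) / len(resid)
        # Update proposal qualities given the current severity estimates.
        by_proposal = defaultdict(list)
        for r, p, s in scores:
            by_proposal[p].append(s - grand_mean - severity[r])
        for p, resid in by_proposal.items():
            quality[p] = sum(resid) / len(resid)
    return grand_mean, dict(severity), dict(quality)

overall, severity, quality = fit_additive_model(scores)
print("reviewer severity offsets:", severity)   # rev_B should come out as consistently harsher
print("proposal quality offsets:", quality)

Under that assumption, a reviewer who scores every proposal about a point worse than colleagues do shows up as a large positive severity offset rather than dragging down the apparent quality of the proposals assigned to them.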

Comments

David Watson

Posts: 1

July 29, 2008

While I believe NIH may carry out normalization of reviewer scoring--can someone definitively confirm or refute this for me?--I also know of at least one other Federal granting agency that does not. Normalization should at least somewhat soften the impact of, for want of a better term, what would be essentially an outlier in terms of scoring behavior.

Even so, as a former PRA (Peer Review Administrator) and Program Officer for another Federal agency, I certainly believe that group dynamics affect panel scoring behavior (especially if the particularly harsh critic also happens to be the panel chair)--I have witnessed this phenomenon first-hand on multiple occasions.

In fact, while working as a PRA for this undisclosed agency, I endeavored to collect scoring data (sans any identifying information) from a large number of proposals across several panels (n = c. 200 proposals in total, I believe). I discovered that, for a given proposal, either too short a discussion time (this agency does not "triage" proposals, as does NIH) or, conversely, too lengthy a discussion made funding less likely. Put another way, for a proposal to receive a score that would likely lead to funding, the proposal in question should be discussed for no more than +/- 1 standard deviation of the mean length of discussion time for all proposals.

The take-home message here, I think, is that as discussion time goes on, reviewers, especially those predisposed not to like a proposal, find ways to convince their peers of the lack of merit of the proposal under discussion.
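
A small illustration of the per-reviewer normalization described above -- z-scoring each reviewer's raw scores against that reviewer's own mean and spread -- on invented data. This is a hypothetical sketch only, and not a description of what NIH or any other agency actually does.

# Hypothetical illustration of per-reviewer score normalization (z-scoring).
# A reviewer who scores everything harshly ends up, after normalization,
# contributing the same relative ranking as a lenient reviewer would.
from statistics import mean, stdev

raw_scores = {
    "lenient_reviewer": {"prop_1": 1.2, "prop_2": 1.8, "prop_3": 2.4},
    "harsh_reviewer":   {"prop_1": 3.0, "prop_2": 3.6, "prop_3": 4.2},
}

def normalize_per_reviewer(raw_scores):
    normalized = {}
    for reviewer, scores in raw_scores.items():
        mu = mean(scores.values())
        sigma = stdev(scores.values()) or 1.0  # guard against a zero spread
        normalized[reviewer] = {p: (s - mu) / sigma for p, s in scores.items()}
    return normalized

print(normalize_per_reviewer(raw_scores))
# Both reviewers now assign identical relative scores, so the harsh reviewer's
# overall severity no longer biases the averages of the proposals they saw.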
Bradley Andresen

Posts: 34

July 29, 2008

Although I am a young scientist and have never served on a review committee, I must ask: can an anonymous review by a group or by individuals ever be fair? If you are known in the scientific community--and since the panel or individual reviewer is, by design, in your area of expertise, it is likely that some of the reviewers will personally know the PI of the grant--then it will be very hard, nigh impossible, for the reviewer to set aside their feelings about the person and their approach to science. In other words, friends help friends, and enemies clash and trash each other's work. There is also the bias toward the "big guy"/Nobel laureate in the field, and the assumption that their science is somehow better than that of a younger, less experienced scientist. If a grant is going to be judged on scientific merit alone, then it should be graded in a double-blind manner. If you are judging the characteristics of the PI (i.e., institution, educational background, previous publications), then the current system is designed perfectly, and there will obviously be biases that seep into the system.
Ellen Hunt

Posts: 199

July 29, 2008

With grant allocations being more and more a matter of influence in Congress, I have become disgusted with the self-fulfilling prophecy of these "Centers for Excellence".

It is long overdue for watchdogs like "The Scientist" to dig deep into the web of corruption that has engendered this influence-peddling scheme.

I know several scientists, and I am thinking about joining them, who won't so much as consider turning in a grant to NIH/NIAID because it is a corrupt charade. Instead, they focus their energy on alternative sources and consulting to support their research.

We also need to end the HIV juggernaut that has crushed other fields with its endless appetite. And yet, there is no disease with less promise for cure or vaccine than HIV. This has made it into an endless gold mine for scientists. Bluntly put, the majority of them piddle around spending huge amounts of money and producing next to nothing while paying themselves outsize salaries. It's a real eye-opener to realize that your typical HIV P.I. is paying themselves on the order of $150 to $200 thousand a year, while the average for scientists in microbiology and medicine is around $90 thousand a year.
anonymous poster

Posts: 85

July 29, 2008

I've always been of the opinion that peer review of scientific proposals is intrinsically subjective. Each individual reviewer comes to the task with his or her own individual biases concerning which scientific questions are "important" vs. not, which technical approaches are appropriate vs. not, etc. Then there is the individual's reaction to the style of the proposal writer -- how much detail is too much, how much is not enough? These are all matters of individual taste, shaped through individual experience, personality, etc. Some reviewers are chauvinistic "champions" of proposals in scientific fields closely related to their own work, while others are highly competitive and criticize proposals in their own fields to death (in part perhaps because they know where the stumbling blocks, however subtle or arcane, lie). And of course, there is the individual reviewer's tolerance for "risk" (however one wishes to define riskiness, which is a whole other question) -- some reviewers can become highly enthusiastic about "risky" proposals while most are highly negative about them. Are any of these reviewers more "wrong" or "right" than others? That depends on who is doing the judging, because that itself is a highly subjective judgement call.

So, I think it is important to accept, a priori, that the very nature of peer review is one of subjectivity, and that unless there are egregious technical or logical "fatal flaws" in a proposal (and of course there are such fatally flawed proposals, and they are easy for intelligent reviewers to spot if the reviewers actually read the proposal carefully and think about what they are reading), the opinions of bright, well-intentioned reviewers will reflect the reviewers' own biases (i.e., where the reviewer is coming from, as opposed to where the proposer is coming from). If we accept this as a given, based on the fact that reviewers are human beings, then we can attempt to identify the biases and deal with them appropriately in the decision-making process (which in turn should be "biased" in terms of agency priorities). This is MORE WORK for the agency scientific staff. It requires more thinking by the staffers, more comparing between the review texts, and even comparing between the reviews and the actual proposals. Agencies would have to trust their staff more than is the case presently, and of course agencies would have to choose their staff more carefully in the first place. And advisory reviewers should be just that -- advisory; they should not have veto power, numerical or otherwise, over an agency's potential funding decisions. Will we ever see such a flexible, openly-and-admittedly-biased system in which the biases are embraced as such, at agencies with entrenched bureaucracies such as NIH? One can only dream. But scientific progress will be faster and more exciting if the dream comes true.
Ram Sihag

Posts: 12

July 29, 2008

Peer review bias is unfortunate. Many useful projects may be side-lined purely because of this malice. To tackle this problem there should be two categories of referees for each project -- at least two from the subject field and another two from the general field. Negative remarks from either category should be resolved in consultation with the author(s) of the project, and the project should then be re-referred to two fresh referees. Though this is a complicated and lengthy procedure, it may remove peer review bias.
anonymous poster

Posts: 9

July 30, 2008

The UK is almost certainly just as bad. Why do we not have two independent panels? Ensure independence with anonymity prior to review, and announce the panel members post-review in the response returned to the applicant. Keep establishment cartels out by changing the panels at each round.
anonymous poster

Posts: 27

July 30, 2008

"And advisory reviewers should be just that -- advisory; they should not have veto power, numerical or otherwise, over an agency's potential funding decisions."\n\nAt NIH this is technically true. Only in very rare cases does a panel recommend a NRFC. Institutes can fund any scored or unscored application that they want they just need to justify it.\n\nThe real question is who do you want deciding what gets funded..an advisory panel of your peers or bureaucrats (Program Officers) at NIH?\n
anonymous poster

Posts: 27

July 30, 2008

"I've always been of the opinion that peer review of scientific proposals is intrinsically subjective."\n\nAll I can say to this is Duh! Of course it is subjective. There is no objective measure of whether an applications is good or not. Someone has to make that determination.
anonymous poster

Posts: 27

July 30, 2008

"I know several scientists, and I am thinking about joining them, who won't so much as consider turning in a grant to NIH/NIAID because it is a corrupt charade. Instead, the focus their energy on alternative sources and consulting to support their research."\n\nGo for it. I am sure NIH won't be too upset. You have no right grant dollars from the Federal gov't. I take it that "corrupt charade"= "Not scored".\n\nIf it makes you feel better to consider these dollars corrupt that is fine. \n\nI am interested to know where your non corrupt $ will come from? Consulting with Big Pharma? Perhaps a nonprofit? They all have their own agenda. \n\nThe bottom line is that there are too many scientists trying to feed from the government trough. Once the funding line moves back up to the 20th percentile (either by more $ or fewer scientists applying) all of these worries will go away.
anonymous poster

Posts: 4

July 30, 2008

Let's face it: because of their subjective nature, all funding systems are inherently flawed. Our response is to become either 'bitter' or 'better'. I hope that most of us will choose the latter.
James Edwards

Posts: 1

July 30, 2008

In '"Objective peer review" is intrinsically impossible' an anonymous poster advocates for a style of peer review used by many programs at the National Science Foundation. External reviewers individually rate the proposals, followed by a panel review, with each panelist also having individually rated the proposal. At the end of the panel, the panelists rank the proposals. All these inputs are then used by the program officer in making the final funding decisions. Good program officers do, as recommended, extensively compare comments by reviewers with the proposals themselves. All the individual reviews (by both external reviewers and panelists) and a document that summarizes the panel's reasons for its recommendation are sent to the PI, along with a letter from the program officer that flags any ad hominem or other out-of-line comments that the program officer did not use in making the funding decision. This process allows in-depth review by external reviewers and comparative review by the panelists, but the final judgment comes from the program officer. At least in the biological sciences at NSF, a large percentage of the program officers are "rotators" who take a year or two away from their laboratories to work at NSF, thus ensuring new perspectives and ideas are incorporated. I have long wondered why NIH has not adopted a similar kind of peer review system, instead of relying almost solely on review panels and their infamous pay lines.
anonymous poster

Posts: 7

July 30, 2008

As a first-time reviewer of SBIR proposals for the NCI, where the reviewers meet only online, I found the process very frustrating. The scores, at times, vary widely, and the general instruction was to come to a unified common score as far as possible. I found myself spending an enormous amount of time -- since this was the first time I was reviewing, I felt it was my duty to do so -- being critical where necessary, but fair. At times, I went searching deep into the literature to check whether the work proposed was new or not. I even found patents in which the topic of the proposal I was reviewing had already been published. But the person in charge would not even listen when I pointed this out, because the other reviewers had missed this information. A lot of things could have been sorted out in a face-to-face meeting, but the NIH was cutting the budget for travel, which is really penny-pinching in my opinion. It is more than worthwhile to spend this money when deciding the fate of submitted grant proposals. Besides, I had a large number of proposals to review -- far too many, I thought. This may be one of the reasons why some reviewers probably do not even read the proposals thoroughly, which is nothing but unfair to the investigators.
anonymous poster

Posts: 1

July 30, 2008

Peer reviewers are judging R01 grants submitted for renewal as "noncompetitive" even though the research results are published in prestigious journals. If the grant is triaged and never receives full discussion, how will the researcher learn what it takes to become "competitive"? Or is this just a "good old boys" tactic of protecting their own by eliminating rivals for a piece of the pie?

Until funding at the NIH is increased, peer review bias will only get worse.
anonymous poster

Posts: 27

July 30, 2008

Read the critiques that the reviewers provided you. To get triaged, everyone has to agree that your app is noncompetitive.

A discussion wouldn't have helped you.
Fred Heydrick

Posts: 1

July 31, 2008

As a former SRO (in my day, Exec Sec) at NIH, I agree with many of the comments raised by fellow bloggers. Yes, the present model is subjective; yes, the review results of committees are only as good as a) the quality and appropriateness of the scientific backgrounds of the review panel members and b) the quality of the applications that are being reviewed. Yes, scoring the science proposed in the applications by the current criteria does not blind the reviewer, because of the necessity to evaluate the PI, Institution, Environment, etc. While the carefully described Johnson report gives statistical validity to the peer review system's flaws, no statistical design is presented for a brand-new approach or better peer review model. Until that happens, the current model containing the "fixes" suggested by CSR is the best we have.

One aspect of the current "fixes" not presented is reviewer-applicant communication. For instance, it is taboo in NIH chartered study sections for a reviewer to contact the applicant. Thus, if an application somewhat outside the Committee's expertise is assigned to the Committee by referral officers, it is the responsibility of the SRO to find an ad hoc expert to comment on the application. While the ad hoc reviewer may be an expert and an advocate for the applicant, there is no opportunity for the committee to have the benefit of further questioning the applicant before or during the review meeting.

The issue of communication is very apparent in Institute panels that evaluate large multi-million-dollar P01s, P60s, etc. The NIH and other agencies have discarded committee site visits to the grantee institutions to "save" funds and instead opt to slog through multiple large grant applications without any communication with the applicant. This type of review is a daunting task for even the best-constructed SEPs. In my experience, the slight increase in the cost of bringing reviewers on site was greatly offset by the ability of the reviewers to dialog with the applicants, see their facilities, and examine the scientific environment and cohesiveness. This hands-on approach gave the review team a much better picture of the proposed grant program, which allowed for more accurate priority scores. Certainly with today's technologies, at a minimum, telecommunication packages could be arranged as substitute site visits.

Hopefully any new peer review design will factor in the importance of communication.

August 6, 2008

I am no longer an active researcher, so perhaps I can give an example of bias that may be enlightening. My last NIH review had 4 reviews attached to it. One gave me rave reviews, two gave OK reviews, and one trashed it completely! I would think that this clearly illustrates that inter-reviewer reliability is minimal.

A less subjective system needs to be devised. I particularly support the statement below, suggesting that those submitting the grants be given a hearing with regard to their proposal, so that they are not being judged on their ability to write but rather on the science. However, this needs to be step 2, after a double-blind reading of the proposals by all the committee members (which of course would require MUCH shorter applications). In this way a small number of reviewers cannot defeat a proposal by presenting their negative interpretation of it to the rest of the committee. While we all like to believe we are non-biased, if we know enough to review a grant we are clearly competitors in the same field, and hence, by nature, competitive.
