This opinion was written in response to “Opinion: Peer Review Study Compromises Response to Gender Bias”
In January, Science Advances published a massive project analyzing the peer-review outcomes of 350,000 manuscripts from 145 journals that found no evidence of gender bias following manuscript submission. Just a month earlier, my colleagues and I published in mBio a similar, though smaller-scale, study that analyzed the peer-review outcomes from 108,000 manuscript submissions to 13 American Society for Microbiology (ASM) journals. Our study found a consistent trend for manuscripts submitted by women corresponding authors to receive more negative outcomes than those submitted by men. Both projects analyzed six years’ worth of submission data that are only available to journal publishers but came to different conclusions.
Estimating possible sources of gender inequalities in the peer-review and editorial processes at scholarly journals is difficult for several reasons. There are serious obstacles to conducting solid, cross-journal experimental studies that test causal hypotheses by manipulating information about, and the contexts of, manuscripts, authors, and referees. Retrospective studies are therefore the only option, and even these are far from simple given the lack of a data-sharing infrastructure between publishers and journals. While this makes any generalization of findings problematic and limited, I believe it is essential to study peer review with large-scale, cross-journal data to avoid overemphasizing individual cases. This is what we tried to do in our recent Science Advances article.
While we knew our findings could be controversial, I am surprised by the way Ada Hagan has misinterpreted our research and would like to comment on the three points on which she based her opinion.
The journal selection is not robust
The lack of randomization in journal selection is a weakness of our study, but we never claimed to have followed a randomized sampling strategy. Moreover, the size, distribution, and quality of our dataset are unprecedented, and applying different statistical models to the same dataset increased the rigor and robustness of our analysis. Previous research of this type was performed only on single journals or small cohorts of similar journals, never at such a cross-domain scale. It is also important to note that we restricted our sample to journals indexed by Web of Science to ensure comparability and to avoid adding further controls to our models (e.g., to account for potential differences in editorial and peer-review standards). This meant excluding a small number of journals, but doing so did not affect the distribution of journals by research area, and our dataset therefore included a broad representation of scientific journals. I will also point out that small-scale studies claiming to have found unequivocal traces of gender inequalities were never accused of relying on a non-randomized, unrepresentative journal sample, despite clearly being limited to one or a few journals.
Each manuscript submission is treated as a single unit
I do agree with Hagan that we could not reconstruct the fate of rejected manuscripts later resubmitted elsewhere, and so could not estimate whether women were delayed in the publication process by multiple rejections. However, this would require a dataset covering thousands of journals from multiple publishers, something currently impossible to obtain. We did control for the round of reviews (i.e., whether reviewers and editors were more demanding in the case of manuscripts by women) and found no significant negative effects. Moreover, I believe that the decision to start from individual manuscripts, rather than from aggregate gender groups as many previous studies, including Hagan's, have done, allowed us to control for any confounders on which we had data while estimating the effect of authors' gender at every step of the peer-review process.
Desk rejections are not evaluated
This is a good point. We had data on desk rejections for only a sub-sample of journals, because some manuscript submission systems recorded this information while others did not, and we reported on this in the preprint version of the paper, published in February 2020. The results suggested that manuscripts with a higher proportion of women among the authors had a lower chance of being desk rejected at health/medicine and social science journals, but a higher chance at physical science journals. We originally included this analysis in the manuscript, but reviewers suggested removing it because the focus of our study was on peer review.
In conclusion, we did not claim to study all the sources of inequalities and bias that affect women in academia, and indeed, we have contextualized our findings in the conclusions to avoid any misinterpretation. Our goal was to find traces of bias against women in the way peer review treats manuscripts submitted in a sample of journals from various research areas. In the end, we found no evidence of such bias. We should, in my opinion, continue to do our best to publish rigorous studies—even if controversial. As scientists, at the end of the day we believe in the power of evidence.
Flaminio Squazzoni is professor of Sociology at the University of Milan, Italy, where he leads the BEHAVE Lab. From 2014 to 2018, he chaired a large-scale EU project on peer review (PEERE).