Last week, Jean-François Bonnefon, a behavioral scientist at the French Centre National de la Recherche Scientifique, tweeted that a scientific manuscript he submitted to a journal had been rejected by a bot. The program had flagged his paper for plagiarism, highlighting the methods, references, and authors’ affiliations. “It would have taken 2 [minutes] for a human to realize the bot was acting up,” Bonnefon wrote in one of his tweets. “But there is obviously no human in the loop here.”
In a massive Twitter thread that followed, several other academics noted having similar experiences.
“I found [Bonnefon’s] experience quite disconcerting,” Bernd Pulverer, chief editor of The EMBO Journal, writes in an email to The Scientist. “Despite all the AI hype, we are miles from automating such a process.” Plagiarism is a complex issue, he adds, and although tools to identify text duplication are an invaluable resource for routine screening, they should not be used in lieu of a human reviewer.
Kim Barrett, a physiologist at the University of California, San Diego, and the editor-in-chief of The Journal of Physiology, tells The Scientist that she was also surprised to hear that a journal rejected a paper based solely on the results of an automated plagiarism screen. “I think that the value in a given journal really lies in the quality of the peer review that it’s able to offer to authors—so the idea that this is a process that could be in some ways automated is kind of anathema to me.”
Travis Gerke, a cancer epidemiologist at the Moffitt Cancer Center in Florida, says that he recently had a similar experience, in which a paper he submitted to one of Springer Nature’s journals came back with a bot-generated plagiarism report that primarily flagged the author list, citations, and standard language about patient consent. “It’s odd that they sent us an absurd report suggesting that we’ve copied other works, and that we have to explain why it is that we haven’t plagiarized,” Gerke says. “That’s not a good system.”
Unlike Bonnefon’s paper, Gerke’s article was not rejected and is still under review. Bonnefon did not disclose the name or publisher of the journal where he submitted his paper, although in the Twitter thread, Wiley said it would like to look into the issue. According to Tom Griffin, Wiley’s director of Global Communications and Media, the publisher had posted this message in response to a now-deleted tweet that named Wiley as the publisher of the journal. “We had hoped to dig further into the researcher’s Twitter comment but unfortunately did not receive a response with the details needed to investigate,” Griffin writes in an emailed statement to The Scientist. “Wiley offers plagiarism software to editors across our portfolio to help detect overlap within the scientific literature. Results provided by the software are for guidance purposes and should be combined with a review of the paper by an editor to ensure accuracy when making a decision.” (Bonnefon declined an interview.)
Many journals use simple text duplication detection software to identify plagiarism, Pulverer says. But these techniques have some limitations—for example, they are unable to capture the plagiarism of ideas, rehashing findings without attribution, or the use of figures or data without permission.
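To make that limitation concrete, here is a minimal sketch of how a simple text duplication check might work. It is an illustration only and is not the software used by any publisher named in this article; the function names, the five-word window, and the sample sentences are all assumptions invented for the example. The sketch splits each text into overlapping five-word sequences and reports what fraction of a submission's sequences also appear in an earlier paper.

```python
import re


def shingles(text, n=5):
    """Return the set of lowercase word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(submission, source, n=5):
    """Fraction of the submission's shingles that also occur in source."""
    sub = shingles(submission, n)
    if not sub:
        return 0.0
    return len(sub & shingles(source, n)) / len(sub)


# Invented sample sentences, used only to illustrate the behavior.
consent = ("Written informed consent was obtained from all patients "
           "prior to enrollment.")
published = "We measured tumor size weekly. " + consent

# Shares only a standard consent statement with the published text.
boilerplate_only = "Our cohort was recruited in 2018. " + consent

# Restates the published finding in different words.
paraphrased_idea = "Each week the size of the tumor was recorded by our team."

print("boilerplate overlap:", round(overlap(boilerplate_only, published), 2))
print("paraphrase overlap:", round(overlap(paraphrased_idea, published), 2))
```

In this toy case the submission that merely reuses a standard consent statement scores a high overlap, while the one that restates the earlier finding in new words scores none, which is why a raw overlap score needs a human reader to interpret it.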
“At Springer Nature papers are first reviewed by a person, then checked using technology, and then checked by a person again,” Susie Winter, the director of communications at Springer Nature, writes in an email to The Scientist. “All decisions at Springer Nature are editorially led. While tools can provide excellent support for the peer review process they do not make a decision—whether a paper is accepted or rejected is the decision and responsibility of the handling editor.”
Several large publishers, such as Elsevier and Springer Nature, have started to test more complex artificial intelligence tools to support the peer review process, for example by identifying statistical issues or by summarizing a paper’s main statements. “These will turn out to be useful editorial tools,” Pulverer writes. “But [they] most certainly should not replace an informed expert editorial assessment, let alone expert peer review.”