Sets of guidelines for how studies should be conducted and reported abound in the life sciences, including ARRIVE, which addresses animal research; STRANGE, a newcomer that tackles animal behavior research specifically; and Cell Press’s STAR Methods. Such frameworks aim to increase the quality and transparency of research.
Last week, in a paper published in PNAS, a working group with members from eLife, Wiley, PLOS, the Center for Open Science, the University of Edinburgh, Nature Portfolio, Cell Press, and Science put forward a new framework that they say can be flexibly applied to a variety of types of life science research and reduce the burden on journals and authors relative to existing guidelines. For example, authors are asked to provide DOIs for preregistered study protocols or clinical trial registrations, if available; to provide full information on the types of cells or experimental animals used and where they were obtained; and to describe the number of times an experiment was replicated. The Scientist spoke with working group member Malcolm Macleod, a neurology professor at the University of Edinburgh who works on research improvement and integrity, about the new framework, dubbed Materials Design Analysis Reporting (MDAR).
The Scientist: How did this working group come about?
Malcolm Macleod: Essentially, what had happened was that a number of publishers, particularly Springer Nature, had sought to implement what became known as the Landis guidelines for transparency and reporting of laboratory research. They’d been published in 2012 [and Nature] had tried to implement these as part of their manuscript handling platform. And they found that they could, and in fact we were involved in a before-and-after study showing that they really did improve the quality of their reporting. But it was so burdensome in terms of in-house editorial staff time that even at Nature, they were struggling to provide that across all of their papers.
There was a view which had emerged that if Nature were finding it difficult, then this wasn’t something that other journals were going to find at all easy, given the relative position of Nature in that space. And so along with other journals, they had come up with the idea of trying to set out a version that was easier to implement . . . for journal editorial handling staff, and for manuscript peer reviewers, and for authors when preparing their manuscripts in a way that would allow them to get what they wanted, with less in the way of blood, sweat, and tears.
TS: How did you set out to lessen those burdens?
MM: We started off from the Center for Open Science TOP framework for reporting and essentially decided that we wanted to have a system which was a framework rather than a straitjacket: that’s to say, it wouldn’t dictate every component of every element that a journal must do, but rather create, if you like, a menu, a smorgasbord of items from which the journal could choose. And that allowed us to go into more depth for some of these items than would be the case if everything had to apply for all journals.
So, for instance, there are some of the items which relate to human subjects research, and some items which relate to cell culture–based research. Now, if you’re a journal that only carries papers in cell culture, you don’t need to worry about the human subjects fields. And if you only do human subjects research, you don’t have to worry about the cell culture fields. So it’s flexible in that way.
The second thing is that we were keen that this shouldn’t be a system where you either passed or failed, if you like—that a journal either had to have a standard that was state-of-the-art perfect or no standard at all, but rather across each of them to have a staged and phased approach that would allow journals to say, well, this is where we think we and our research community are just now, and so this is the level of the standard that we are going to start at. And then, depending on how things change over time, then we may step up to the next level in the standard. Or if we find that nobody’s getting anywhere with it, then maybe we can step back down a level, if you see what I mean. . . . It sets out minimum requirements and then best practice recommendations.
Those minimum requirements are about accessibility (where the data, the material, the code, whatever, are), identification with unique and unambiguous identifiers, and then a minimum description [and] characterization.
And then, finally, by seeking to have something which could be implemented across journals publishing work in the life sciences, rather than each journal have their own slightly different version, we wanted to make things easier for authors. And they would know that if they prepared a manuscript according to this framework as applied in their field, then that would be what the journals were asking them for. [They wouldn’t] have to keep thinking about what their target journal was, and how they needed to change things or do things differently, according to the journal.
TS: Do you think that this might be something, because it has these tiers, that might be used to sort of rate journals or specific papers, kind of like impact ratings?
MM: I was one of the only two people involved in the development [of MDAR] that didn’t have an affiliation to a journal or to a publishing house. For me, as an outsider . . . [if] I’m a researcher, I’ve done a bit of work, I’ve written it all up, I’ve got a manuscript, I can put that manuscript on bioRxiv, and it’s indexed and it sits there, and people will find it. Where does the added value come from journal publication? I guess there’s three bits of that. One is in terms of dissemination. One is in terms of those lovely typefaces and layouts and all of that. But third is the idea that if my work gets into journal X, then that says something about the quality of my work.
And so one could imagine that at a time when scientific publishing . . . has appropriately been criticized for fetishizing the impact factor and the H-index and the like, when we and others have shown that that often bears a very limited relationship to the underlying quality of the research that’s carried—I think you’re right, that a journal that could show that it had adopted this framework and that, importantly, the work that it published met the requirements of the reporting framework . . . would see its reputation rise. It is an opportunity, as you suggest, for journals to try and show what they’re good at and also, then, for others to make inferences on the quality of the work carried in different journals.
TS: What do you hope that the impact on science will be more broadly of having this framework that’s easier to implement?
MM: I’ve been involved in this space for ten or fifteen years now and have done a couple of evaluations both of the ARRIVE guidelines and of the Nature checklist, and was involved in both this and the revision to the ARRIVE guidelines for in vivo research. And it’s clear that there is no single magic bullet. I do not think, for the avoidance of doubt, that this is a single magic bullet that’s going to change the world. I think that incrementally, we should be able gradually to improve the quality of published research. And I hope that this will make a contribution to that by increasing understanding in journals and with authors . . . of what represents best practice research reporting, but also on why those things are important. There’s an explanation and elaboration document published as supplementary materials alongside this article which goes into that in some detail.
Now don’t get me wrong, I think progress is being made. And in fact, for animal research—things like the reporting of blinding, the reporting of randomization—as far as we can tell, is probably increasing in absolute terms by about two percent per year over the last decade. Now that’s good, but at that rate of progress, I’m going to be long retired before we get up to the sort of levels that I would like to see. So anything that can help speed that along and provide more impetus to those processes—that’s to say, to facilitate and enable and strengthen a strand of improvement and activity that’s already happening—would be a good thing.
One of the nice things about the MDAR framework is that because it provides a fairly standardized framework across journals, then it makes it more feasible for third parties and for startup companies, for instance, to develop tools that might assist in the editorial handling process. For instance, natural language processing or machine learning tools . . . would then highlight areas where [they] thought that manuscript was strong, areas where [they] saw that manuscript was weak, and could then surface those, if you like, machine-derived opinions to the peer reviewers and the editorial staff so that they can then check more easily how that particular manuscript was performing against the framework.
TS: You mentioned reporting of the blinding and randomization in animal experiments. Are there other areas that come to mind that are particularly in need of improvement in terms of reporting? And if so, why are those areas important?
MM: There’s two directions to go in here. One is, after we fix randomization and blinding, what else should we fix in animal experiments? And I think the two things for me that would be most important would be, first of all, that people reported their sample size calculations so we know why their experiment was the size that it was; and secondly, that people had articulated what they were going to do in advance in a protocol for the study that was available in the public domain, and was dated before the experiment began. There’s this beautifully named questionable research practice called HARKing, which you may have heard of, which is Hypothesizing After the Results are Known. So, you’re testing a hypothesis, you don’t quite get the right answer. But then you convince yourself that actually, you were testing something else. And hey, if we test something else, p’s less than 0.05. And you get around that by getting people to assert in advance what their hypothesis is, and what they’re going to consider to be winning, and what they’re going to consider to be success. So that’s going in one direction.
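Editor’s illustration (not part of the interview): For readers unfamiliar with what a reported sample size calculation looks like, here is a minimal sketch. It assumes a two-group comparison of means, a two-sided test, and the standard normal-approximation formula; the function name `animals_per_group` and the default values for `alpha` and `power` are illustrative choices, not anything specified by MDAR or by Macleod. A real study would typically use a t-distribution-based calculation, which gives slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two group means.

    effect_size is the standardized difference the study should detect
    (expected difference in means divided by the standard deviation).
    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show how n grows as the
    # expected effect shrinks.
    for d in (1.0, 0.5):
        print(f"effect size {d}: {animals_per_group(d)} per group")
```

Reporting the inputs (expected effect, variability, significance level, power) alongside the resulting number is what lets a reader see why an experiment was the size that it was.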
The second direction, which I think is probably more important, is in extending the application of concerns about, for instance, randomization and blinding to the in vitro research space. I mean, in epistemological terms, measuring outcome in response, for instance, to a drug in an animal is just the same epistemological process as measuring in a collection of cells. There’s no reason why randomization and blinding would be important in animals, but not important in cell culture. And yet, when we look at the level of reporting of blinding in cell cultures: perhaps two or three percent; randomization in cell culture: probably about the same order of magnitude, two or three percent.
So at a time when much of our research is moving from whole-animal research to nonanimal alternatives, and to in vitro research, I think it’s critically important that we try and begin to work out where the risks of bias are in in vitro research. Of course, they won’t be the same as they are in in vivo research. Some will be common across the two, but things like pseudoreplication in cell culture experiments, things like, how many times did you actually do the experiment, and how many of them make it into your paper? It’s much easier and cheaper to repeat a cell culture experiment until you get the right answer than it is to repeat a six-month animal experiment until you get the right answer. And I suspect that quite a lot of what we see in the in vitro research field, if you knew the details of exactly what was done, would be substantially less convincing than it appears on the page.
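Editor’s illustration (not part of the interview): As a rough sketch of how randomization and blinding could be applied to a cell culture experiment, the snippet below randomly allocates culture wells to treatment arms in equal numbers and gives each well an uninformative code. The function name `randomize_and_blind`, the well labels, and the code format are all hypothetical. The assumption is that the outcome assessor sees only the codes, while the key linking codes to treatments is held by someone else until the analysis is done.

```python
import random
from collections import Counter


def randomize_and_blind(unit_ids, treatments, seed):
    """Allocate units to treatments at random and assign blinding codes.

    Returns (blinded_codes, key): blinded_codes is what the outcome
    assessor sees; key maps each code to (unit_id, treatment) and should
    be kept away from the assessor until outcomes are recorded.
    """
    if len(unit_ids) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)  # recording the seed makes the allocation reproducible
    per_arm = len(unit_ids) // len(treatments)
    allocation = [arm for arm in treatments for _ in range(per_arm)]
    rng.shuffle(allocation)  # random, balanced assignment of arms to units
    codes = [f"S{i:03d}" for i in range(1, len(unit_ids) + 1)]
    rng.shuffle(codes)  # codes carry no information about unit or arm
    key = {
        code: (unit, arm)
        for code, unit, arm in zip(codes, unit_ids, allocation)
    }
    return sorted(key), key


if __name__ == "__main__":
    wells = ["A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4"]
    blinded, key = randomize_and_blind(wells, ["vehicle", "drug"], seed=2021)
    print("Assessor sees:", blinded)
    print("Arm sizes:", Counter(arm for _, arm in key.values()))
```

Note that if the wells all come from a single culture, they are not independent replicates, so this kind of allocation does not by itself address the pseudoreplication concern raised above.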
TS: Have any journals signed on so far and said that ‘Yes, we are going to use this framework’?
MM: I know for sure that several of the leading journals involved . . . are committed to implementing this, either in whole or in part.
Editor’s note: This interview was edited for brevity.