© BRYAN SATALINO
Open-access papers are more popular in the scholarly literature than they’ve ever been, and this trend shows no signs of abating, according to a study of hundreds of thousands of papers published in journals spanning disciplines from physics and astronomy to chemistry and social science. The sprawling study, published this week (August 2) in PeerJ, found that 28 percent of the total scholarly literature is open access, and in 2015 (the most recent year with data complete enough to analyze), 45 percent of papers were open access.
By analyzing usage of the recently unveiled browser extension Unpaywall, which points users in the direction of open-access [OA] versions of papers, the authors found that nearly half of a sample of 100,000 papers users searched for in June were open access. Information scientist and coauthor Jason Priem, cofounder of Impactstory, the nonprofit that spun out the open-source data platform OADOI, which powers Unpaywall, spoke with The Scientist about the study and its implications.
The Scientist: What were the main takeaways from the study?
Jason Priem: We’re finding from this, encouragingly, that the absolute number of open-access articles continues to grow as does the literature itself. But perhaps more relevantly, the percentage of articles available free-to-read online is also growing. So we’re quite pleased to see that as open-access fans ourselves.
TS: Was there any one discipline that stood out in terms of the prevalence of open-access papers in its literature?
JP: The discipline with the highest percentage of open access was biomedical research, with about 58 percent. The second highest would be mathematics, which is largely due, we’re pretty sure, to the arXiv preprint server. On the other side of things, the two most remiss disciplines when it comes to open access were chemistry, which was about 16 percent OA, and engineering and technology, which is a little better at about 18 percent.
TS: Why do you think biomedical science has so many open-access articles?
JP: It looks like it’s probably PubMed Central [PMC]. Again, I think there’s probably going to need to be more research in order to be formally sure about this. But we were able to look at the growth of PMC and we can kind of see the growth in green OA [also known as self archiving], so it certainly looks like PubMed Central, and particularly the NIH mandate to deposit things in PubMed Central, seems to be the biggest engine causing biomed to be proportionally more OA compared to some other fields.
TS: You and your coauthors also analyzed the citation of open-access articles in your data set. What does the fact that open-access papers were cited 18 percent more than the average citation rate tell us about these types of articles?
JP: I’m always a little cautious about making a naive equivalence between citations and quality. It tends to provoke a lot of arguments that get in the way of moving forward. We do think that citations are one very reasonable proxy for impact. If the work is not being cited, it’s hard to make a strong case for it having a lot of impact on the literature.
Our findings do support the citation advantage. We do see that open-access articles are more likely to be cited than their toll-access counterparts. Interestingly, gold OA [a publishing model where open-access article authors, their funders, or their institutions pay a fee to publish their work] is sort of the exception to this. Articles published in gold OA journals actually get fewer citations, and that seems to be accelerating over time.
But we do see a very robust advantage for hybrid articles, for green OA articles, and for a category we call bronze, which are just sort of a free-to-read-online in a journal where they’re not really publishing anything about the license.
TS: Is that a term that you guys coined, ‘bronze OA’?
JP: We coined it. We looked at this category and said, ‘Well, this is a category that doesn’t have a real name in the OA literature yet.’ Because there is this concept of delayed OA, so it’s open-access in a toll-access journal after some embargo period. But that doesn’t include some of the other stuff that we were finding. We were finding things that were free to read in a toll-access journal immediately upon publication for sort of promotional reasons, or at least what appeared to be promotional reasons. It was a particularly newsworthy article or something. They wanted it to be free-to-read for a little while, and then eventually it could go back under a paywall. We saw journals that had policies that had these weird windows where they would have an embargo period and then it would be OA, and then after four years, it would go back to being toll-access.
In the ballpark of half of the bronze OA articles are things that are published by journals that do publish 100 percent open-access content immediately. So they’re not delayed OA, they’re not anything like that. But they don’t publish any kind of a license. . . . Basically, 100 percent of the copyright restrictions on the article still apply. So if I wanted to copy this article and host it on another site or something like that, that would be illegal based on the license terms of this journal. So we weren’t sure what to call these.
What they all share in common is the article is free to read, which we say ‘yeah’ to. That’s great. We’re happy that it’s free to read. But it isn’t under any kind of a license that would tend to ensure a longer-term persistence for the freedom to read that article or the ability to mine it and reuse it in other ways beyond readership.
TS: Do your data suggest that open-access papers will become increasingly prevalent in the literature?
JP: The data suggest that it will. It’s a pretty compelling curve. It seems to be moving pretty steadily upward. If I was a gambling man, that’s what I’d be putting my money on.
TS: Will there ever be a time when everything that’s published will be open-access? Or is that a pipe dream?
JP: I absolutely expect, and indeed am counting on, there being a time when nearly everything is available open online. There’s always going to be situations where there’s a couple holdouts. That may never go away. So I could easily see there being 95 percent of the literature is free-to-read online. And I think that will happen, optimistically, in the next decade, more pessimistically, in the next two decades. But I think it’s probably hard to imagine that there’s not going to be some stubborn redoubt of toll-access somewhere.
The business model of, ‘I’ve got something you want, and I won’t let you look at it unless you pay me money,’ is a very old business model. It’s been going on for quite a long time. I suspect there will be people who continue to find a way to make money off of that. That said, I think that’s going to be such a tiny minority, simply because research funders are getting tired of funding research that’s needlessly hidden behind a paywall.
TS: Did you and your coauthors look into the effects of geographic origin on open-access publishing?
JP: Nope. We haven’t done that yet. That’s #futureresearch. It’s a great question, and we’re looking forward to looking into that down the road.
TS: What do you think about SciHub?
JP: One thing that I really like about SciHub is it highlights for people both the unfortunate state of affairs right now, where you have to resort to using a pirate site to read papers, and it also shows people what the future could look like—a future where all papers are just universally open-access, and how powerful that could be. I think those are all great things about SciHub, and I think SciHub is helping to move a conversation forward. On the other hand, a really big problem with SciHub is that—at least based on the interpretation of the law that I have read—it’s not legal. It’s a pirate site. And, consequently, I think there are very serious question marks about its long-term sustainability.