A paper in PLOS Biology today (June 8) describes Wide-Open, an automated system that scans published papers for references to publicly available datasets and determines whether those data are indeed available. The system, which identified hundreds of datasets overdue for public release in one particular functional genomics data repository, has garnered resounding support from researchers, open-science advocates, and database curators alike.
“[The system] is remarkably simple, very straightforward, and . . . very impactful,” says biological data analyst and open-science proponent Titus Brown of the University of California, Davis, who was not involved in the study. “It is a really great example of a simple idea that’s easy to implement that nobody else thought of.”
Advances in biological techniques and computational technologies mean it has never been easier for scientists to accumulate, store, and, in the interests of collective knowledge, share their data. Indeed, for many biologists, the normal course of events is to generate data, submit them to a centralized repository, and then make them available to the public upon publication of the associated study.
But, as Maxim Grechkin and Bill Howe of the ...