Making Public Data Public

Computational scientists develop a system for spotting data overdue for public release, and end up getting hundreds of open-access datasets corrected.

Ruth Williams

WIKIMEDIA, MIGUEL ANDRADE

A paper in PLOS Biology today (June 8) describes Wide-Open, an automated system that scans published papers for references to publicly available datasets and determines whether those data are indeed available. The system, which identified hundreds of datasets overdue for public release in one particular functional genomics data repository, has garnered resounding support from researchers, open-science advocates, and database curators alike.

“[The system] is remarkably simple, very straightforward, and . . . very impactful,” says biological data analyst and open-science proponent Titus Brown of the University of California, Davis, who was not involved in the study. “It is a really great example of a simple idea that’s easy to implement that nobody else thought of.”

Advances in biological techniques and computational technologies mean it has never been easier for scientists to accumulate, store, and, in the interests of collective knowledge, share their data. Indeed, for many biologists, the normal course of events is to generate data, submit them to a centralized repository, and then make them available to the public upon publication of the associated study.

But, as Maxim Grechkin and Bill Howe of the ...

Meet the Author

Ruth Williams

Ruth is a freelance journalist.