Disputes Over Text-Mining

Computer programs that trawl research papers can reveal important large-scale patterns and facilitate further research, but publishers are wary.

March 21, 2013

Researchers are increasingly keen to use computer programs that scour the text of thousands of scientific papers, a method known as text-mining, but publishers tend to block such programs. The resulting disagreements are coming to a head, reported Nature, with the European Union set to rule on the legality of text-mining, and researchers and publishers discussing the terms by which the method can be used.

“Data- and text-mining techniques . . . could hold the key to the next medical breakthrough, if only we freed them from their current legal tangle,” Neelie Kroes, vice-president of the European Commission, told a Brussels intellectual-property summit last September, according to Nature.

Indeed, text-mining of the scientific literature has already proven useful. For example, Raul Rodriguez-Esteban, a computational biologist at drug company Boehringer Ingelheim in Connecticut, told Nature that he used the method to search roughly 23,000 articles to identify hundreds of proteins that ameliorate multiple sclerosis in a mouse model. He then identified other proteins that interacted with them to find potential drug targets.

But it can take years to negotiate agreements with publishers to trawl their content, if permission is granted at all. Some publishers argue that programs downloading thousands of papers would strain their servers, and many fear that text-mining will lead to free distribution of their content.

The United Kingdom will later this year make text-mining for non-commercial purposes exempt from copyright, allowing scientists to mine any content they have paid for. The European Commission appears less amenable to such a measure, however. And in the United States, “fair use” rules may apply but the law is far from clear.

Meanwhile, CrossRef, a non-profit collaboration of publishers, is working on a system that will allow researchers to agree to text-mining terms and conditions by clicking a button on the publisher’s website. And the Copyright Clearance Center, which works with publishers on rights licensing, aims to collect the different publishers’ terms and conditions on one website to facilitate easy access for researchers and drug companies.

Comments

March 22, 2013

Profit over progress? Surprising. Data-mining is single-handedly making Moore's Law possible. 

HStaley

Posts: 6

March 22, 2013

Ease of access has become everything. If we can eliminate the systematic barriers, the achievements of science will be ten-fold.

Great, great article Dan.
