Microparadigms in cell biology?

Textual model questions efficiency of gaining scientific knowledge

Published scientific statements, whether they are later proven true or false, have a profound effect on subsequent interpretations by researchers and on the probability that they will eventually come to a correct conclusion about a scientific question, a statistical analysis of protein interaction literature reveals. The findings, published this week in the Proceedings of the National Academy of Sciences (PNAS), suggest that the way these "microparadigms" bias future interpretations may actually slow down the process of gaining scientific truth.

The paper "sets up a very sophisticated model" to answer large-scale questions about how scientific knowledge is produced that "no one has previously been able to measure," said Neil Smalheiser at the University of Illinois at Chicago, who did not participate in this study.

These findings hint that the "current way we produce and interpret results is not optimal" for scientists to ultimately converge upon the correct result, according to first author Andrey Rzhetsky of Columbia University. "The model suggests that dependence between statements is too strong."

Rzhetsky's team assessed 1.5 million unique statements about protein interactions from 150,000 full-text articles in 78 journals (GENEWAYS 6.0). Using a binary system (e.g., protein A either interacts or does not interact with protein B), they chronologically ordered statements about each pair of proteins to construct chains of reasoning over time.

The group then simulated different ways scientists might approach published findings, and assessed the probability that each scenario would lead to the correct answer at any given step of the chain. If scientists trust nobody, for example, and ignore all previous literature, the probability of publishing the correct result remains constant.
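
The "trust nobody" baseline can be illustrated with a toy simulation. The sketch below is not the authors' model: the function names, the `p_correct` parameter, and the reading of `momentum` as the chance that a scientist simply publishes the majority view of earlier statements are all assumptions made for illustration.

```python
import random

def simulate_chain(length, p_correct, momentum, rng):
    """Toy chain of statements about one protein pair (hypothetical model).

    Each entry is True if the published statement is correct.
    `momentum` is assumed here to be the probability that a scientist
    publishes the majority view of earlier statements instead of
    their own experimental result.
    """
    statements = []
    for _ in range(length):
        # What the scientist's own experiment says, taken in isolation.
        own_result = rng.random() < p_correct
        published = own_result
        if statements and rng.random() < momentum:
            n_true = sum(statements)
            n_false = len(statements) - n_true
            if n_true != n_false:
                # Defer to the majority; on a tie, keep the own result.
                published = n_true > n_false
        statements.append(published)
    return statements

def accuracy_by_step(length, p_correct, momentum, trials=20000, seed=0):
    """Estimate the fraction of correct statements at each chain position."""
    rng = random.Random(seed)
    hits = [0] * length
    for _ in range(trials):
        chain = simulate_chain(length, p_correct, momentum, rng)
        for i, ok in enumerate(chain):
            hits[i] += ok
    return [h / trials for h in hits]

if __name__ == "__main__":
    # momentum=0.0 is the "trust nobody" case; 0.8 is a strongly conformist one.
    print(accuracy_by_step(8, 0.7, 0.0))
    print(accuracy_by_step(8, 0.7, 0.8))
```

With `momentum` set to 0, the estimated accuracy should stay close to `p_correct` at every step, matching the constant probability described above. Raising it lets the earliest statements steer the rest of the chain, whether or not they were right.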

At other extremes, scientists could be super-conformists (usually agreeing with the majority opinion about a given protein-protein relationship) or super-anti-conformists (usually agreeing with the minority opinion).

The authors searched their real-world data set for these hypothetical patterns; while all five were present, the pattern of mild skepticism was most common.

When they measured the momentum, or strength of influence, of published statements on future interpretations, they found that scientists give their own data at least 10-fold greater weight than others' findings, but are still heavily influenced by previous results and particularly the majority opinion -- revealing a tendency toward conformism. What's more, the authors discovered that a strikingly large proportion of results (95%) are positive -- reporting the presence rather than the absence of an interaction.

According to the authors' stochastic analysis, this predominance of positive results can only be explained by one of two extremes: a very low rate of experimental errors or exceptionally invalid experiments. So scientists are either perpetuating truth or perpetuating errors, Rzhetsky said.

The authors also found that the momentums of actual published statements are too high to optimize the probability of coming to the right result at the end of a given chain. This phenomenon could be explained by the premium placed on new data in science publishing, said Gully Burns at the University of Southern California, who did not participate in this study. "You can't really get things published simply reproducing other people's results," he said. To produce correct scientific knowledge more efficiently, Rzhetsky suggested "independent benchmarking" by an institution that would periodically verify a sampling of the literature.

This paper demonstrates the utility of similar exercises for large-scale data mining, even beyond protein interactions, Burns told The Scientist.

Researchers have performed similar data mining only in sequence databases, added Smalheiser. "It's probably as sophisticated an example of text mining as there is so far, [and] more direct and more sensitive than citation analysis."

Still, according to Burns, it will be important to "parametrize more details of individual experiments" in a future model, for example by accounting for the section of the paper in which a statement is found or the animal model or cell type used to derive it.

Rzhetsky said this work is part of a larger effort to sort and evaluate millions of facts from the literature to create an overarching model of cellular interactions. "A huge amount of information is already published and locked in literature," he said. "We're trying to get that information out."

Ishani Ganguli
iganguli@the-scientist.com

Links within this article

A. Rzhetsky et al., "Microparadigms: Chains of collective reasoning in publications about molecular interactions," PNAS, March 14, 2006. http://www.pnas.org/cgi/doi/10.1073/pnas.0600591103

R. Finn, "Program uncovers hidden connections in the literature," The Scientist, May 11, 1998. http://www.the-scientist.com/article/display/18032/

Neil Smalheiser
http://www.psych.uic.edu/faculty/smalheiser.htm

Andrey Rzhetsky
http://genome6.cpmc.columbia.edu/andrey/

Gully Burns
http://www-rcf.usc.edu/~gully/