Hypothesis-Free? No Such Thing

Even so-called "discovery-driven research"

By Steven Wiley | May 1, 2008

Following a recent computational biology meeting, a group of us got together for dinner, during which the subject of our individual research projects came up. After I described my efforts to model signaling pathways, the young scientist next to me shrugged and said that models were of no use to him because he did "discovery-driven research". He then went on to state that discovery-driven research is hypothesis-free, and thus independent of the preexisting bias of traditional biology. I listened patiently, because I have heard this argument many times before.

I was too polite to point out that all biological research is hypothesis-driven, although the hypothesis might be implicit. Genomic sequencing projects might seem to lack a hypothesis, but the resulting data are exploited by hypothesizing specific evolutionary relationships between different genes.

The idea that there are actually two distinct ways of conducting biological research was formally proposed several years ago in a Nature Biotechnology commentary (R. Aebersold et al., 18:359, 2000). The authors described "discovery science," like genome sequencing projects, as blindly cataloguing the elements of a system, disregarding any hypotheses on how it works. In contrast, they described "hypothesis-driven science" as being small-scale, narrowly focused, and using a limited range of technologies.

Although the authors' intent was to justify large-scale research as a valid way to approach biological problems (another frequent topic at after-meeting dinners), in my opinion, casting it as hypothesis-free did the emerging field of systems biology a great disservice. To imply that large-scale systems biology research can be productively conducted without a prior set of underlying hypotheses is nonsense. A good hypothesis is at the heart of the best science, regardless of scale.

We started our systems biology program almost eight years ago, and one of our first projects was to establish the relationship between specific cell signaling pathways and both gene and protein expression. We thought that important patterns would quickly become self-evident, but sorting through lists of thousands of genes and proteins quickly disabused us of that idea. We could see patterns, but they simply did not make any obvious sense. We mostly know the relationship between gene expression and subsequent protein levels, but looking at thousands of genes made it seem more complex, and overwhelmed our intuition.

To extract biological meaning from the data required a level of simplification. And this is where we needed a hypothesis. By postulating that specific classes of proteins were degraded at an accelerated rate, for example, we could create hypothetical patterns against which to compare our data. This allowed us to quickly look for both expected and unexpected relationships. After our initial, disappointing foray into "discovery science", we subsequently used specific hypotheses to guide our experimental designs. For example, by proposing that signaling pathways regulate the shedding of proteins from the cell surface, we were able to identify these proteins, relate them to specific signaling pathways, and discover that they are frequently released by cancer cells (Jacobs et al., J Proteome Res, 7:558, 2008).

Despite the importance of hypotheses in systems biology research, they are not always explicitly stated. As biologists, we are well trained in posing small, specific questions, but we have little familiarity with framing systems-level hypotheses. (Unlike small questions, systems-level hypotheses might take the form of postulating how the outputs from different signaling pathways are combined.) Likewise, our intuition regarding systems-level relationships in biological systems is difficult to translate into experimental design. This is why computational models are so central to systems biology research. Unlike humans, computers are very good at keeping track of complex relationships and predicting how low-level changes will alter higher-level functions. Computational models, however, must be built from a set of explicit, hypothesized relationships.

Finding meaningful relationships in complex datasets also requires starting with the appropriate data. A hypothesis usually takes the form of a mechanistic relationship between a specific cause and a consequent effect, and this will almost always depend on experimental context. There are some circumstances when data must be gathered in the absence of context or hypothesis to characterize a system, but it is unrealistic to expect such preliminary studies to lead to significant biological insights. For this, you need a hypothesis.

Systems biology might be the future of biology, but we still need hypotheses to take us where we want to go.

Steven Wiley is a Pacific Northwest National Laboratory Fellow and director of PNNL's Biomolecular Systems Initiative.


anonymous poster

Posts: 1

May 2, 2008

Wiley should have drawn the distinction between "a priori hypotheses" and "retrospective hypotheses". Much research (genome projects, for example) is conducted without an a priori hypothesis, but can then be subjected to tests of various hypotheses that come out of the "fishing expedition". I argue that this approach is very fruitful, and I am continually dismayed by granting agencies that reject grants such as these.

Consider the example of finding all potential substrates for a given kinase. Is it better (more productive) to search through individual proteins that you hypothesise might be substrates, or to use the kinase as bait in a broad-scale yeast two hybrid assay?

STEVE MOUNT


Posts: 1

May 3, 2008

My comment was going to be a lot like the first, anonymous, comment. I certainly favor the scientific method, but I also recognize that unbiased hypothesis-free high-throughput data allows hypothesis-driven data mining. A complete genome sequence (for example) is independent of the hypothesis that led to its elucidation. If it is high-quality data, then it can be used to address a variety of hypotheses. All too often, data generated in order to address a specific narrow hypothesis is not as useful for addressing others as data generated without bias. Economics and astronomy provide models for how this sort of science is done.

anonymous poster

Posts: 69

May 7, 2008

The topic as well as the comments are actually painful to read since it is generally accepted in methodological circles that the Baconian free experiment does not exist. Theory can be done without experiment but experiment cannot be done without theory. Thus the paper in Nature Biotechnology, if cited correctly, would be rather a puerile exercise at best.

Methodological statements and debates must be on a higher plane, and these statements are woefully inadequate. I omit my name merely to indicate that no offense but serious criticism is intended.

May 7, 2008

I have a hypothesis that sweeping generalizations ("there is no such thing as hypothesis-free science!") don't make good science.

I have a further hypothesis that there are classes of scientific problems for which generating and testing hypotheses is a good strategy to advance understanding, and other classes where different strategies may be more productive.

I suspect that any strategy can be shoe-horned into a pseudo-hypothesis-straitjacket, but at the expense of doing some violence to the spirit of the scientific method.

What makes a 'good' hypothesis? An idea, in the form of a succinct conjecture, consistent with known relevant facts, about how some aspect of the universe works, that produces falsifiable predictions.

Succinctness removes extraneous details to reduce the search field and ensure that hypotheses can be distinguished by experimental testing of their consequences, but if few known facts exist to constrain the number of possible hypotheses, then the search space may still be too large. Techniques that generate large amounts of data and trawl for patterns may be a better bet to begin with. Is this a hypothesis-driven approach? Not in my book... the 'hypothesis' that a pattern exists can't be falsified.

The opposite problem exists when possible theories are overconstrained by plentiful 'facts' to the point that attempts to accommodate them all lead to elaborate and complicated hypotheses whose consequences are mostly beyond the horizon of what is currently experimentally possible. Time then to tease out all the 'facts' and implicit assumptions and check their domains of validity carefully. Is this a hypothesis-driven approach? Again no... 'there is a mistake somewhere' is not falsifiable.

Hypothesis generation and testing is an important, powerful method for doing science, but turning it into a mantra to cover everything we do in the name of science produces content-free 'dummy hypotheses' that detract from scientific credibility.

Ruth Rosin

Posts: 117

May 7, 2008

It is my humble opinion that research in every new scientific field must inevitably begin without any hypotheses. Scientists then act like cooks who decide to throw this, that, and the other together in a pot, cook it, and see what comes out.

Only after they gain sufficient information based on such "haphazard cooking" may scientists begin to discern repetitive patterns, which allow them to start formulating hypotheses.

David Hopp

Posts: 1

May 8, 2008

Those of you who had the time to take a history or philosophy class may recall that Francis Bacon, in Novum Organum, 1620 (this is not a misprint), advocated hypothesis-free investigation. I feel that present and past attempts at this lack an understanding of both language and the way the human mind deals with information. Nevertheless, Novum Organum - despite its daunting length and prolixity - does have some very useful insights, particularly The Four Idols of the Mind.

R K Nibbe

Posts: 1

May 9, 2008

I work in the area of proteomics. My present project involves studying changes in the proteome of late stage colon cancer. I conducted a big screen of clinical samples using a gel-based approach. Actually, 2 big screens. I identified many targets changing. Took me a lot of time and effort. Used a lot of cool technology too. Was this hypothesis-driven "research?" My committee didn't think so, but not a one disagreed it was a sensible way to begin.

So far the results have inspired a novel and hypothetical in-silico network analysis (work I'm ready to publish), but no one, including me, saw that coming at the start.

In our area I like to say we don't reject null hypotheses, we create them.

anonymous poster

Posts: 1

May 9, 2008

In our peer review system, grant funds are presumably awarded based on the intellectual strength of the hypothesis and the preliminary data that support it. Ultimately, it is hypotheses and data that decide if a study is published and replicated.

This type of "hypothesis free big science" costs a lot of money. If we're not going to judge these proposals based on their hypotheses, then how should we decide whether they are worthy of funding? Why fund this hypothesis free project but not the other? Who will decide and how?

One cannot escape the feeling that advocates of hypothesis free research would like to receive hundreds of millions of dollars without the effort to come up with a hypothesis.

Lastly, the high throughput screens that are mentioned in the comments are mere "techniques" or "methods" that are used to carry out an 'experiment'. Please do not confuse methods with the experiment itself!!!

Lukas Buehler

Posts: 1

May 12, 2008

I very much appreciate this article, since the claim that discovery science is not based on any hypothesis has always bothered me. Large scale technologists have for a long time prided themselves on being above hypothesis-driven science. Working as a consultant in DNA microarrays, I have always wondered why discovery is not considered a hypothesis. Finding patterns, finding genes that are expressed, finding relationships among those genes are all hypotheses. Even cataloguing the elements of a system is never truly blind, but systematic. Without a system in mind, a plan, there can be no experiment.

anonymous poster

Posts: 24

May 14, 2008

Data collection is descriptive, like natural history, and as such is not science. It can be a useful precursor to science though. If you do some post hoc science with that data after it's been collected then you will have to be testing a hypothesis. If not then you are just doing more data collating. Discovery-driven science is a roundabout way of saying 'we'll think of the hypothesis later, just give us the money and we'll do something useful with it'. Essentially 'it's all good'. This approach to research is just plain lazy. If you want to collect some data to have a poke around, then presumably there was a reason why you wanted to collect that data in the first place. Err, that was the hypothesis, and you just missed it.

Henk R. Braig

Posts: 1

May 14, 2008

How are new species of animals, plants, microbes, parasites or pathogens discovered and described? Descriptive work is not hypothesis driven, but is it bad science? It is certainly published in low impact-factor journals. Should it be funded only at the margins? But without described organisms, it is difficult to do hypothesis-driven research.

tian xia

Posts: 34

May 14, 2008

It is a piece of jargon I have never fully understood. I often hear it and don't know exactly what they are talking about. It seems we already know something (to some people, they know everything), then we want to see if this and that works. If it is a tiny little experiment, is it a hypothesis? Hypothesis seems to be a rather BIG thing, otherwise why not use "I have an idea", which everybody understands. By the way, is that important?

Bart Janssen

Posts: 5

May 14, 2008

In the simplest form the scientific method takes observations and develops an hypothesis that may explain the observations, AND is testable by controlled experiment.

The expectation is that the experiment is carried out, new observations made and the hypothesis may or may not survive the test.

But the key step is that observations are made before any hypothesis is made. An hypothesis made without observations is non-science, or nonsense, whichever you prefer.

In the 1600s the Royal Society was formed by scientists who made observations and developed hypotheses. But first they made observations. And that observational part of the process was called, and still is called, discovery science.

Dr Wiley seems to think discovery science was invented in the year 2000 or perhaps it was just the most recent reference.

It has been around a teensy bit longer.

To suggest it is not valid science is to deny several hundred years of scientific progress made in large part by the efforts of scientists who spent their lives gathering observations without ever attempting to formulate an hypothesis. Many would not have dreamed of attempting the arrogance of formulating an hypothesis without a complete set of observations.

In the last three decades of the last century hypothesis driven biology dominated, mostly because we did not have the tools to make observations easily. But in the last 10 years, like the members of the Royal Society 500 years ago, we've been blessed by the presence of some inspired tool makers who've allowed us to make observations that were simply unimaginable by most of us.

And now we have a time where it is possible to spend a career making observations and leaving others to formulate hypotheses that explain those observations.

Is Dr Wiley trying to suggest that spending a career making observations is not "real" science? Or that the people making hypotheses are somehow more important than those making observations?

By all means do hypothesis driven science, but be aware that for a long time science was not advanced by hypotheses but instead by observation and discovery. Both observation and hypothesis are critical to the scientific method; to belittle either is wrong.
