Ethical guidelines drastically limit experiments on human subjects. Hence, the fundamental mechanisms of human diseases are mostly studied in vitro or in animal models. These are only substitutes for understanding human physiology and disease. Proving that a mechanism responsible for disease progression in a model system is also relevant to human disease, let alone translating it into a new therapeutic, is a major bottleneck in biomedicine. In the end, only clinical interventions on humans will bridge models and human disease.
One approach is to look for correlations. If you can show that patients with tumors expressing, for example, stem cell markers have a much worse prognosis than those without them, that would suggest that stem cells are involved in human disease progression. This line of thinking has long been popular in oncology because you need only access to surgical specimens, an mRNA or protein marker, and patient follow-up data. And with the recent advent of efficient microarray screens, this approach has become all the rage, reducing the discovery of signatures, i.e., multigene markers, to a nearly automatic procedure.
The signatures’ prognostic potential can then be tested instantly in genome-wide compendia of expression profiles for hundreds of human tumors, all freely available in the public domain. Besides stem cell markers, signatures linked to all sorts of biological mechanisms or states have been shown to be associated with human cancer outcome. Indeed, several new signatures are published every month in prominent journals.
But such correlations are not all that they seem. The accumulation of signatures with all sorts of biological meanings, but nearly identical prognostic value, already looked suspicious to us and others back in 2007. It seemed that every newly discovered signature was prognostic. We collected from the literature some signatures with as little connection to cancer as possible. We found, for example, a signature of the blood cells of Japanese patients who were told jokes after lunch, and a signature derived from the microarray analysis of the brains of mice that suffered social defeat. Both of these signatures were associated with breast cancer outcome by any statistical standard.
We then went back to published cancer signatures and found that 60 percent were no more prognostic than signatures made by picking genes at random among the 21,000 human genes. The problem occurred with single-gene markers, but became dramatic with multigene signatures. A gene chosen at random already has a roughly one-in-five chance of being prognostic; among signatures made of more than 100 genes, 90 percent are prognostic. How is this possible? We showed that in breast cancer the expression of a large fraction of the genome correlates with the proliferation rate, which is prognostic in this disease.
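The random-signature comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis from our paper: the data are synthetic, the 40 percent of genes tracking proliferation is an arbitrary choice, and a plain Pearson correlation with a continuous outcome stands in for the survival models a real study would use. The function names (`signature_score`, `association`, `random_signature_control`) are hypothetical.

```python
import numpy as np

def signature_score(expr, gene_idx):
    """Mean expression of the signature genes for each patient."""
    return expr[:, gene_idx].mean(axis=1)

def association(score, outcome):
    """Absolute Pearson correlation between signature score and outcome.

    Assumption: a stand-in for a proper survival statistic.
    """
    return abs(np.corrcoef(score, outcome)[0, 1])

def random_signature_control(expr, outcome, signature, n_random=1000, seed=0):
    """Compare a signature with random gene sets of the same size.

    Returns the observed association and the empirical p-value, i.e. the
    fraction of random signatures at least as associated with outcome.
    """
    rng = np.random.default_rng(seed)
    n_genes = expr.shape[1]
    observed = association(signature_score(expr, signature), outcome)
    null = np.empty(n_random)
    for i in range(n_random):
        idx = rng.choice(n_genes, size=len(signature), replace=False)
        null[i] = association(signature_score(expr, idx), outcome)
    p_empirical = (1 + np.sum(null >= observed)) / (1 + n_random)
    return observed, p_empirical

# Synthetic cohort (all numbers below are illustrative assumptions).
rng = np.random.default_rng(42)
n_patients, n_genes = 200, 2000
proliferation = rng.normal(size=n_patients)

# Assume 40 percent of genes track proliferation; the rest are pure noise.
loadings = np.where(rng.random(n_genes) < 0.4, 0.8, 0.0)
expr = np.outer(proliferation, loadings) + rng.normal(size=(n_patients, n_genes))

# Outcome is driven by proliferation plus noise.
outcome = -proliferation + rng.normal(size=n_patients)

# A biologically meaningless 100-gene signature, picked at random.
signature = rng.choice(n_genes, size=100, replace=False)
observed, p_emp = random_signature_control(expr, outcome, signature, n_random=500)
print(f"association = {observed:.2f}, empirical p vs random signatures = {p_emp:.3f}")
```

In this toy setting the meaningless signature is strongly associated with outcome simply because many of its genes follow proliferation. The question that matters is the second number: whether a signature does better than random gene sets of the same size.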
It took us four years and six rejections to get this work finally published in a computational biology journal (PLoS Comput Biol, 2011), which is not the most efficient venue to reach the oncology community. Meanwhile, a steady stream of studies confounded by proliferation rates has appeared. It has to be said: one can no longer stay silent about the rather limited self-correction capability of the top-tier publishing system (Cell, Nature Genetics, PNAS, etc.), which promoted these studies in the first place.
The oncogenomics literature has forgotten the pitfalls of non-specific effects and the value of negative controls. It is not enough to show that a signature is prognostic; biological conclusions may be drawn only if its prognostic value is specifically driven by the mechanism or state under investigation. Importantly, we question prognostic signatures as specific research tools, not as clinical guides: smoke does not drive fire, yet it is a powerful indicator of when and where a fire is burning.
Vincent Detours is a researcher at the Université Libre de Bruxelles in Belgium.