The amount of mRNA expressed by a tumor seems to serve as a reliable indicator of disease progression, a new analysis of human tissue samples finds. The correlation, which researchers identified for 15 types of cancer, was enabled by a new statistical technique that helps make sense of massive sets of convoluted data.

Unlike research conducted on the homogeneous tumors that cancer scientists can grow in a lab and use as experimental models, studying cancers that grow in humans can be a messy affair. Cancer cells, microbiota, and human immune cells can all exist in close proximity to one another, and researchers attempting to sequence the tumor cells typically choose between bulk sequencing, which yields difficult-to-interpret data, and single-cell sequencing, which quickly becomes costly in terms of both time and money. The new technique, however, can pull useful information out of bulk data generated from human cancer samples and provide a level of resolution typically only offered by sequencing cells one at a time, the researchers behind the work say.

See “Could Cancer’s Microbiome Help Diagnose and Treat the Disease?”

The technique produces a measurement called tumor-specific total mRNA expression (TmS), which represents the ratio of tumor cells’ mRNA expression to that of the nontumor cells surrounding them, according to the study published yesterday (June 13) in Nature Biotechnology. Essentially, TmS is calculated by first estimating the total RNA expression of a given cell population, which is then divided into tumor and nontumor cells based on the presence of biomarkers. The total expression attributed to each kind of cell is then divided by the fraction of the population that kind of cell makes up, resulting in a per-cell measurement of RNA expression. Finally, that number is divided by cell ploidy to make sure TmS represents cellular activity rather than the number of chromosomes present.
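To make the arithmetic in that description concrete, here is a minimal Python sketch of a TmS-style ratio. It follows only the steps described above and is not the study's actual implementation; the function names, the parameter names, and the default of diploid (ploidy 2) nontumor cells are all assumptions made for illustration.

```python
import math


def per_genome_expression(mrna_fraction: float, cell_fraction: float, ploidy: float) -> float:
    """Relative mRNA output per cell, scaled by ploidy (hypothetical helper).

    mrna_fraction: share of the sample's total mRNA attributed to this cell type.
    cell_fraction: share of the sample's cells that belong to this cell type.
    ploidy: average number of chromosome sets per cell of this type.
    """
    if not 0.0 <= mrna_fraction <= 1.0:
        raise ValueError("mrna_fraction must be between 0 and 1")
    if not 0.0 < cell_fraction < 1.0:
        raise ValueError("cell_fraction must be strictly between 0 and 1")
    if ploidy <= 0:
        raise ValueError("ploidy must be positive")
    per_cell = mrna_fraction / cell_fraction  # expression per cell
    return per_cell / ploidy                  # expression per chromosome set


def tms_sketch(tumor_mrna_fraction: float, tumor_cell_fraction: float,
               tumor_ploidy: float, nontumor_ploidy: float = 2.0) -> float:
    """Ratio of tumor to nontumor per-genome expression (illustrative only).

    Assumes the sample contains exactly two groups, tumor and nontumor cells,
    and that nontumor cells are diploid unless stated otherwise.
    """
    tumor = per_genome_expression(tumor_mrna_fraction, tumor_cell_fraction, tumor_ploidy)
    nontumor = per_genome_expression(
        1.0 - tumor_mrna_fraction, 1.0 - tumor_cell_fraction, nontumor_ploidy
    )
    if nontumor == 0:
        raise ValueError("nontumor expression is zero; ratio is undefined")
    return tumor / nontumor


if __name__ == "__main__":
    # Made-up inputs: tumor cells are half of the sample but produce 80% of its mRNA.
    print(tms_sketch(0.8, 0.5, tumor_ploidy=2.0))
    # Same sample, but with tetraploid tumor cells: the ratio is halved.
    print(tms_sketch(0.8, 0.5, tumor_ploidy=4.0))
```

With these invented numbers, the first call gives a ratio of about 4 and the second about 2, showing how the ploidy correction keeps a tumor with extra chromosome copies from looking more active than it is.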

Lead study author Wenyi Wang, a biostatistician at the University of Texas MD Anderson Cancer Center whose work typically focuses on deconvoluting and modeling data, says that the discovery of a technique to measure total tumor cell mRNA expression in bulk sequencing data was an accident. Her original goal was to measure tumors’ expression of specific genes, but she was unable to find consistency between RNA sequencing and DNA sequencing data. As she tried to figure out the root of the issue while revisiting the literature, she realized that, while it was known that total tumor cell mRNA expression was important, no one had established a “direct link with patient outcomes as well as other biological correlates in large cohorts,” she tells The Scientist.

It was when she was validating her technique on 6,590 human tumor samples spanning 15 cancer types that she made the “kind of accidental” discovery that mRNA expression across many cancers correlates with disease progression, she explains. In general, tumors from later-stage cancers yielded higher TmS ratios and, in some types of cancers, higher TmS values were associated with biomarkers of enhanced tumor metabolism or activity. Previous studies from other groups had identified a similar trend, correlating cancer prognosis with mRNA expression, but those studies tended to be smaller, to focus on specific genetic sequences, or to be otherwise limited to one kind of cancer. In contrast, Wang says this is the first time that the correlation has been identified at this scale and through a methodology that is applicable across cancer models.

Subhamoy Dasgupta, a molecular biologist at Roswell Park Comprehensive Cancer Center who didn’t work on the study, says the new technique “gives a more comprehensive look inside the cancer mRNA in a much more lucid way.”

It is possible to obtain tumor cell mRNA expression levels using single-cell sequencing, Wang says, but doing so would be prohibitively expensive for a large sample size and difficult to do since “actual patient tumors come as-is”—meaning they can be difficult to work with, especially while isolating and sequencing specific individual cells.

“Based on my understanding of an overview of their method, I feel it’s a really good work,” says Dasgupta. He adds that the study was impressive for including data from an unprecedented number of tissue samples and “bringing it together in a very unbiased way and very robust way.”

The biological pathways underlying the connection between mRNA levels and cancer progression remain unclear. Dasgupta suggests that TmS could serve as a proxy measurement for a tumor’s metabolic state, in that a tumor would need to generate more mRNA to synthesize proteins in order to sustain itself. If that’s the case, he suggests that having that metabolic data on hand may impact a patient’s prognosis or treatment plan. Wang says that she’s already begun working with clinicians to explore TmS-guided clinical trials in which TmS would be calculated for new cancer patients and used to help determine proper treatment.

Wang says that the correlation will likely apply to cancer types in addition to the 15 studied here. The study was limited by data availability, smaller sample sizes for some kinds of cancer, and the confounding effect of chemotherapy (especially in triple-negative breast cancer samples), all of which hindered the researchers’ ability to look for the same correlation in other cancers, she notes. Dasgupta says he suspects that solid tumor cancers will demonstrate a similar correlation, but that so-called liquid cancers like leukemia may differ.

Both Wang and Dasgupta say they’re optimistic that the bioinformatics behind calculating TmS—and especially the way the team was able to derive that measurement from bulk sequencing data—might result in other discoveries. The metric could be straightforwardly applied to already-gathered samples from large cancer studies, Wang notes, perhaps leading to clinically relevant findings.