It’s the question on every cancer patient’s mind: How long have I got? Genomicist Michael Snyder wishes he had answers.
For now, all physicians can do is lump patients with similar cancers into large groups and guess that they’ll have the same drug responses or prognoses as others in the group. But their methods of assigning people to these groups are coarse and imperfect, and often based on data collected by human eyeballs.
“When pathologists read images, only sixty percent of the time do they agree,” says Snyder, director of the Center for Genomics and Personalized Medicine at Stanford University. In 2013, he and then–graduate student Kun-Hsing Yu wondered if artificial intelligence could provide more-accurate predictions.
Yu fed histology images into a machine learning algorithm, along with pathologist-determined diagnoses, training it to distinguish lung cancer from normal tissue, and two different types of lung cancer from each other. Then he fed in survival data for those slides, letting the system learn how that information correlated with the images. Finally, he added in new slides that the model hadn’t seen before, and asked the all-important longevity question.
The computer could predict who would live for shorter or longer than average survival times for those particular cancers—something pathologists struggle to do.1 “It worked surprisingly well,” says Yu, now an instructor at Harvard Medical School.
But Snyder and Yu thought they could do more. Snyder’s lab works on -omics, too, so they decided to offer the computer not just the slides, but also tumor transcriptomes. With these data combined, the model predicted patient survival even better than images or transcriptomes alone, with more than 80 percent accuracy.2 Today, pathologists normally make survival predictions based on visual evaluations of tissue micrographs, from which they assess a tumor’s stage—its size and extent—and grade, the likelihood that it will grow and spread further. But pathologists don’t always agree, and tumor grade doesn’t always predict survival accurately.
Snyder and Yu aren’t the only researchers who are recognizing the power of AI to analyze cancer-related datasets of images, of -omes, and most recently, of both combined. Although these tools have a long way to go before they reach the clinic, AI approaches stand to yield a precise diagnosis quickly, predict which treatments will work best for which patients, and even forecast survival.
For now, some of those applications are still “science fiction,” says Andrea Sottoriva, a computational biologist at London’s Institute of Cancer Research who’s working on AI to predict cancer evolution and choose the right drugs to treat a given tumor. “We aim to change that.”
INPUT: Images, OUTPUT: Diagnosis
Finding and treating cancer before it progresses too far can be key to increased survival. When it comes to cervical cancer, for example, early detection leads to five-year survival rates of more than 90 percent. Doctors can fry, freeze, or excise precancerous cells in the top four millimeters of the cervix’s transformation zone, a ring of tissue surrounding the cervix where cancer most often arises. Once the cancer metastasizes, however, survival rates drop to 56 percent or lower over five years.
Early treatment is commonplace in developed nations, where women get regular Pap smears to check for abnormal cervical cells and tests for the human papillomavirus that causes the cancer. But in the developing world, such screenings are rare. There is a cheaper test—health care workers coat a woman’s cervix in acetic acid, looking for telltale white areas that could indicate cancer—but “this technique is so inaccurate,” says medical epidemiologist Mark Schiffman of the National Cancer Institute. As a result, some healthy women undergo treatment while others might have their precancerous cells missed, leading to cancer that requires more-radical treatments, such as chemotherapy, radiation, or hysterectomy.
Schiffman and other groups have been trying to find a way to make acetic acid screening more accurate—for example, by imaging with spectra other than white light. Schiffman’s team had accumulated thousands of cervix pictures from diverse sources in the US and Costa Rica, including photos taken by health-care professionals with a magnifying camera called a colposcope or with a cellphone. But he was about to give up. “We couldn’t make it be really as sensitive or as accurate or as reproducible as the other [tests].”
Infographic: How AI Takes On Cancer. Scientists have been using two main forms of clinical data to predict cancer outcomes: images (either photographs, as in the case of skin cancer, or pathology slides) and -omes of various sorts. Applying ever-more sophisticated machine learning approaches to these datasets can yield accurate diagnoses and prognoses, and even infer how tumors evolve (yellow arrows). Now, scientists are finding that images can predict -omics (blue arrows). Combining the two data sources gives researchers even better predictions of how long a cancer patient will live (thick purple arrows). The ultimate goal of these algorithms, currently under development in basic biology labs, is to help doctors select treatments and forecast survival. (Credit: The Scientist staff)
Then, near the end of 2017, a nonprofit associated with the Bill & Melinda Gates Foundation called Global Good reached out. The organization wanted to try machine learning on Schiffman’s image collection, to see if a computer could provide diagnoses when physicians could not.
So Schiffman teamed up with Global Good and other collaborators to use a particular kind of machine learning, called a convolutional neural network, to analyze the cervix images. The goal of the algorithm was to identify features in the images—for example, how similar or dissimilar side-by-side pixels tend to be—that help it get the right diagnosis. At the start, its accuracy was no better than chance. As it analyzed more and more images, it weighed those features to help it find the answer. “It’s a process of getting hotter, hotter, colder, colder, oh yeah, hotter, hotter . . . until it gets as close as it possibly can,” explains Schiffman.
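To make the idea of comparing side-by-side pixels concrete, here is a minimal sketch in plain Python of one filter sliding across a tiny grayscale image. The function name `convolve2d`, the filter values, and the image are illustrative assumptions, not details of the network used in the study; a real convolutional neural network learns many such filters from the training images rather than having them written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (both lists of lists) without padding.

    As in most neural-network libraries, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 1x2 filter: right-hand pixel minus left-hand pixel.
# It responds strongly where neighboring pixels differ.
HORIZONTAL_DIFF = [[-1, 1]]

# Hypothetical image: dark on the left, bright on the right.
IMAGE = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

feature_map = convolve2d(IMAGE, HORIZONTAL_DIFF)
print(feature_map)  # [[0, 9, 0], [0, 9, 0]]
```

The nonzero column in the output marks the edge between the dark and bright regions. Training adjusts the filter values so that the resulting feature maps become useful for the diagnosis.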
The team started with cervix images collected over seven years in Costa Rica from more than 9,000 women. Schiffman had also amassed data from more-accurate screening tests in these women, along with 18 years’ worth of follow-up information on precancer or cancer diagnoses. The researchers used 70 percent of the complete dataset to train the model, then tested its performance using only the images from the remaining 30 percent. Schiffman couldn’t believe the results: the algorithm distinguished between healthy tissue, precancer, and cancer with a score of 91 percent on a standard measure of the accuracy of machine learning predictions. Human visual inspection, in contrast, scored only 69 percent.3 “I’ve never seen anything this accurate,” says Schiffman. He was sure there was some mistake.
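The 70/30 procedure can be sketched in a few lines of Python. The article does not name the accuracy measure; the sketch assumes it is something like the area under the ROC curve (AUC), a common choice for this kind of test, and that is an assumption rather than a detail from the source. The function names and the toy scores are hypothetical.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle the records and return (training 70 percent, held-out 30 percent)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve.

    Equivalent to the probability that a randomly chosen positive case
    (label 1) receives a higher score than a randomly chosen negative
    case (label 0); ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out cases: true labels and the model's scores.
held_out_labels = [1, 1, 0, 0]
held_out_scores = [0.9, 0.4, 0.5, 0.1]
print(auc(held_out_labels, held_out_scores))  # 0.75
```

Keeping the held-out 30 percent away from training is what makes the final score a fair estimate of how the model handles images it has never seen.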
The group checked its work and asked collaborators at the National Library of Medicine to independently verify the technique. There was no error: the machine really was that good at identifying precancer and cancer. Armed with this new tool, Schiffman hopes to develop a low-cost screening test for cervical cancer coupling a cell phone–type camera with machine-based image analysis. First, he wants to train his algorithm on tens of thousands of cell-phone cervix images from all over the world.
He’s not the only one eyeing smartphones for cancer diagnosis. Skin lesions—which might be cancerous or benign—are right on the surface, and anybody can snap a shot. Researchers at Stanford University built a database of almost 130,000 photographs of skin lesions and used it to train a convolutional neural network to distinguish between benign bumps and three different kinds of malignant lesions, with at least 91 percent accuracy. The algorithm outperformed the majority of 21 dermatologists asked to assess the same pictures.4
A major challenge to creating predictive models of cancer is acquiring enough high-quality data. When the Stanford team compiled images of skin cancer from Stanford Medical School and from the internet, the angles, zooms, and lighting all varied. The researchers had to translate labels from a variety of languages, then work with dermatologists to correctly classify the lesions into more than 2,000 disease categories.
And, of course, most cancers require more than a smartphone camera to see what’s going on. Observing individual cells in tumors requires microscopy. Scientists would also like to incorporate as much information as possible about a person’s clinical treatments and responses, plus molecular data such as genomes, but that too can be hard to come by, says Yu. “Rarely will we find a patient with all the data we want.”
INPUT: Images + -Omes, OUTPUT: Survival
As Snyder and Yu have found, -omics data, when available, can provide information about the molecular pathways involved in a given cancer that may help identify cancer type, survival, or likely response to treatment. In their initial, image-based studies, the researchers had 2,186 lung tissue slides, disease classifications from human pathologists, and patient survival times. The researchers used a computer algorithm to extract from those images nearly 10,000 features, such as cell shape or size, which they used to train several machine learning algorithms.
One approach that worked well is called Random Forest. It generates hundreds of possible decision trees; then those “trees” vote on the answer, and the majority rules. This algorithm was more than 75 percent accurate in distinguishing between healthy tissue and the two cancer types, and it could predict who fell into the high- or low-survival group with greater accuracy than models based solely on the cancer’s stage.1 “This is something that goes beyond the current pathological diagnosis,” says Yu.
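The voting idea can be illustrated with a deliberately small sketch. It assumes one-question trees ("stumps") in place of full decision trees, and the two features and six training cases are invented for illustration; none of this reproduces the researchers' model. Each stump is trained on a random resample of the cases and on one randomly chosen feature, and the forest returns the majority vote.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for low_label, high_label in ((0, 1), (1, 0)):
            errors = sum(
                (high_label if row[feature] > threshold else low_label) != label
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, low_label, high_label)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, low_label, high_label = stump
    return high_label if row[feature] > threshold else low_label


def train_forest(rows, labels, n_trees=101, seed=0):
    """Train each stump on a bootstrap resample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(
            train_stump([rows[i] for i in picks], [labels[i] for i in picks], feature)
        )
    return forest


def predict_forest(forest, row):
    """Every stump votes; the most common answer wins."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical image features: [cell size, nucleus irregularity].
ROWS = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [3.0, 20.0], [3.2, 21.0], [2.9, 19.5]]
LABELS = [0, 0, 0, 1, 1, 1]  # 0 = healthy, 1 = cancer (toy labels)

forest = train_forest(ROWS, LABELS)
```

Calling `predict_forest(forest, [0.8, 9.0])` asks all 101 stumps for a verdict on a new case. Because each stump sees a slightly different slice of the data, occasional poor stumps are outvoted by the rest.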
In their follow-up study, the researchers ran their trained image analysis algorithm on histopathology slides from 538 people with lung cancer, then added transcriptomes and proteomes from those same patients, and asked the “random forest” to vote on the grade of their cancers. The expression levels of 15 genes predicted cancer grade with 80 percent accuracy. These genes turned out to be involved in processes such as DNA replication, cell cycle regulation, and p53 signaling—all known to play roles in cancer biology. The team also identified 15 proteins—not the ones encoded by the 15 genes—involved in cell development and cancer signaling that predicted grade with 81 percent accuracy. While the researchers didn’t compare this to human performance, one study of pathologists found 79 percent agreement on lung adenocarcinoma grading5—suggesting the machine and humans were equally accurate. But the machine was going further, apparently homing in on the specific gene-expression factors driving a cancer’s progression.
Finally, the researchers asked the computer to predict survival based on gene expression, cancer grade, and patient age. With all those data, the model achieved greater than 80 percent accuracy, correctly sorting cases into long-term and short-term survivors better than human pathologists, transcriptomes, or images alone.2
Inspired by Snyder and Yu’s work, Aristotelis Tsirigos and colleagues at New York University School of Medicine also sought to link images to genetics in lung cancer, using 1,634 slides of healthy or cancerous lung tissue. Based on images alone, their convolutional neural network was able to distinguish adenocarcinoma from squamous cell carcinoma with about 97 percent accuracy. Then, the team fed the algorithm data on the 10 most commonly mutated genes in lung adenocarcinoma, and the computer learned to predict the presence of six of those mutations from the pathology slides with accuracy ranging from 73 to 86 percent.6 “It works quite well,” commented Sottoriva, who was not involved in the work. “As a start, it’s quite exciting.”
Of course, doctors and scientists don’t need to identify mutations via imaging; other tests are more straightforward and more accurate, with genetic sequencing providing a nearly perfect readout of the cancer’s genome. This study, explains Tsirigos, serves to demonstrate that genetics and image features are related in predictable ways. Now, he’s working to combine histopathology and molecular information to predict patient outcomes, as Yu and Snyder’s group did. These kinds of methods should work for any cancer type, says Tsirigos, as long as researchers have the right data to input.
INPUT: -Omes, OUTPUT: Tumor Evolution
-Omics data are also useful on their own, even without images. For example, Sottoriva and colleagues are using genomics to understand tumor evolution. One tumor is typically made up of multiple cell lineages all derived from the same original cancer cell. To effectively treat cancer, it’s important to understand this heterogeneity and the manner in which a tumor evolved. If a treatment works on only a portion of a tumor, the cancer will come back. “It’s a real matter of life and death,” says Guido Sanguinetti, a computer scientist at the University of Edinburgh and a collaborator on the tumor evolution studies.
By sampling multiple parts of an individual tumor, researchers can infer what evolutionary paths the cancer took; it’s akin to sampling modern human genomes to trace various populations back to ancestral groups. Tumors from different patients, even with the same kind of cancer, tend to have wildly different evolutionary trees. Sanguinetti, Sottoriva, and colleagues think that if they can find common pathways that cancer tends to follow, oncologists could use that information to categorize people who are likely to have similar disease progression, or to respond similarly to drugs.
To find those common evolutionary trees, the researchers used a form of machine learning called transfer learning. The algorithm looks at all the trees from patients’ genomes simultaneously, sharing information between them to find a solution compatible with the whole group, explains Sanguinetti. They called their tool REVOLVER, for Repeated Evolution in cancer. As a first test, they invented fictional tumor evolutionary trees. When they fed REVOLVER genomics data based on those made-up trees, it did spit out a phylogeny that matched the invented trees.
To validate the tool in a well-known form of cancer evolution, the researchers turned to the transition to malignancy in colorectal cancer. This happens as a benign adenoma accumulates mutations in known driver genes: for example, in APC, then KRAS, then PIK3CA. The researchers fed REVOLVER a set of genomes from nine real benign adenomas and 10 malignant carcinomas. Sure enough, the model drew phylogenetic trees that matched the adenoma-to-carcinoma transition.
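As a rough illustration of what "repeated evolution" means, the sketch below counts how often one driver mutation precedes another across patients and keeps the orderings shared by most of them. This is a simple counting stand-in, not REVOLVER's transfer learning method, and the four patient trajectories are invented; only the gene names come from the example above.

```python
from collections import Counter
from itertools import combinations


def count_orderings(trajectories):
    """Count, across patients, how often one driver was acquired before another."""
    before = Counter()
    for order in trajectories:
        # combinations() keeps list order, so `earlier` always precedes `later`.
        for earlier, later in combinations(order, 2):
            before[(earlier, later)] += 1
    return before


def repeated_orderings(trajectories, min_fraction=0.75):
    """Return the orderings seen in at least `min_fraction` of all patients."""
    before = count_orderings(trajectories)
    n_patients = len(trajectories)
    return sorted(
        pair for pair, count in before.items() if count / n_patients >= min_fraction
    )


# Hypothetical patients: the order in which each tumor acquired its drivers.
PATIENTS = [
    ["APC", "KRAS", "PIK3CA"],
    ["APC", "KRAS", "PIK3CA"],
    ["APC", "PIK3CA", "KRAS"],
    ["KRAS", "APC", "PIK3CA"],
]

print(repeated_orderings(PATIENTS))
# [('APC', 'KRAS'), ('APC', 'PIK3CA'), ('KRAS', 'PIK3CA')]
```

In this toy cohort no single patient dictates the answer, yet the consensus ordering of APC, then KRAS, then PIK3CA emerges from the group as a whole.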
The group then analyzed tumor samples for which the evolution was less well understood. In genomes from 99 people with non-small-cell lung cancer, REVOLVER identified 10 potential clusters of patients, based on the sequence of mutations tumors accumulated. People in some of those clusters lived for less than 150 days, while those placed in other clusters survived much longer, suggesting the categories have prognostic value. Similarly, REVOLVER found six clusters among 50 breast cancer tumors with varying levels of survival between clusters.7 “We didn’t expect to find groups, really,” says Sottoriva. “These results tell us that evolution in cancer can be quite predictable.”
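The link between clusters and survival can be sketched as follows. The sketch groups patients whose tumors acquired mutations in a similar order, using Jaccard similarity and a greedy grouping rule, then compares median survival between groups. The similarity measure, the grouping rule, the placeholder mutation names, and the survival figures are all assumptions made for illustration; they are not the method or the data from the study.

```python
from itertools import combinations
from statistics import median


def ordered_pairs(trajectory):
    """Represent a trajectory as the set of (earlier, later) mutation pairs."""
    return set(combinations(trajectory, 2))


def jaccard(a, b):
    """Shared pairs divided by all pairs seen in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_patients(trajectories, min_similarity=0.5):
    """Greedy grouping: join the first cluster whose founder is similar enough."""
    clusters = []
    for patient, trajectory in trajectories.items():
        pairs = ordered_pairs(trajectory)
        for members in clusters:
            founder_pairs = ordered_pairs(trajectories[members[0]])
            if jaccard(pairs, founder_pairs) >= min_similarity:
                members.append(patient)
                break
        else:
            clusters.append([patient])  # no match: start a new cluster
    return clusters


def median_survival(clusters, survival_days):
    """Median survival, in days, for each cluster."""
    return [median(survival_days[p] for p in members) for members in clusters]


# Hypothetical trajectories (A, B, C are placeholder mutations) and survival.
TRAJECTORIES = {
    "p1": ["A", "B", "C"],
    "p2": ["A", "B", "C"],
    "p3": ["C", "B", "A"],
    "p4": ["C", "A", "B"],
}
SURVIVAL_DAYS = {"p1": 120, "p2": 140, "p3": 900, "p4": 1100}

clusters = cluster_patients(TRAJECTORIES)
print(clusters)                                  # [['p1', 'p2'], ['p3', 'p4']]
print(median_survival(clusters, SURVIVAL_DAYS))  # [130.0, 1000.0]
```

A large gap in median survival between groups, as in this toy output, is the kind of signal that would suggest the groupings carry prognostic value.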
Medicine runs on those kinds of predictable patterns, says Sottoriva. And AI is a powerful tool to help identify patterns that are clinically relevant. Moreover, by selectively eliminating certain pieces of data from the model’s input and seeing if its accuracy drops, bioinformaticians are starting to figure out what features the computers are using to distinguish those patterns, says Tsirigos.
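The elimination test described here can be sketched in a few lines. The sketch blanks out one input feature at a time by replacing it with its average value, then records how far accuracy falls. The toy model, the replacement-by-average choice, and the data are assumptions for illustration only.

```python
def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(rows, labels)) / len(rows)


def ablation_drops(predict, rows, labels):
    """For each feature, the accuracy lost when that feature is blanked out."""
    baseline = accuracy(predict, rows, labels)
    drops = []
    for f in range(len(rows[0])):
        mean_value = sum(row[f] for row in rows) / len(rows)
        # Replace feature f with its average so it no longer carries information.
        ablated = [row[:f] + [mean_value] + row[f + 1:] for row in rows]
        drops.append(baseline - accuracy(predict, ablated, labels))
    return drops


def toy_model(row):
    """Hypothetical classifier that relies only on the first feature."""
    return 1 if row[0] > 2.0 else 0


# Hypothetical cases with two features each.
ROWS = [[1.0, 5.0], [1.5, 7.0], [3.0, 6.0], [3.5, 4.0]]
LABELS = [0, 0, 1, 1]

print(ablation_drops(toy_model, ROWS, LABELS))  # [0.5, 0.0]
```

Accuracy collapses when the first feature is removed and is untouched when the second is removed, which reveals that the model depends on the first feature alone.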
Current AI applications for cancer research are just the beginning. Future algorithms may incorporate not only -omes and images, but other data about treatment outcomes, progression, and anything else scientists can get their hands on.
“At the end of the day,” says Snyder, “when dealing with complicated diseases like cancer, you want every bit of information.”
Amber Dance is a freelance science journalist living in the Los Angeles area.
1. K.H. Yu et al., “Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features,” Nat Commun, 7:12474, 2016.
2. K.H. Yu et al., “Association of omics features with histopathology patterns in lung adenocarcinoma,” Cell Syst, 5:620–27.e3, 2017.
3. L. Hu et al., “An observational study of deep learning and automated evaluation of cervical images for cancer screening,” J Natl Cancer Inst, doi:10.1093/jnci/djy225, 2019.
4. A. Esteva et al., “Dermatologist-level classification of skin cancer with deep neural networks,” Nature, 542:115–18, 2017.
5. Y. Nakazato et al., “Nuclear grading of primary pulmonary adenocarcinomas,” Cancer, 116:2011–19, 2010.
6. N. Coudray et al., “Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning,” Nat Med, 24:1559–67, 2018.
7. G. Caravagna et al., “Detecting repeated cancer evolution from multi-region tumor sequencing data,” Nat Methods, 15:707–14, 2018.