Genetic mutations are a hallmark of cancer. Aided by advances in genomics and sequencing, researchers have therefore worked to identify mutations that drive cancer and to correct for them.
Determining the function of mutated “cancer genes” has led to notable successes in treatment. Imatinib (Gleevec) counteracts the negative effects of a gene fusion in chronic myelogenous leukemia; trastuzumab (Herceptin) treats breast cancers with amplifications of the gene coding for the human epidermal growth factor receptor 2 (HER2); and gefitinib (Iressa) and erlotinib (Tarceva) treat cases of lung cancer positive for mutations in the gene for epidermal growth factor receptor (EGFR).
Some researchers fear, however, that they have already picked the low-hanging fruit when it comes to identifying individual mutated genes that can serve as cancer drug targets, and that relevant new targets will be much harder to find. Indeed, one 2014 analysis estimated that researchers have nearly reached saturation in their search for the relatively small number of genes that are mutated more than 20 percent of the time in patients with a given tumor type (Nature, 505:495-501, 2014). But as the sequencing of ever-larger numbers of tumors continues, less-common mutations will continue to be identified, the report predicts.
Researchers estimate that cancer genome panels identify at least one actionable mutation in 40 percent to 60 percent of patients with common solid tumors (Cell, 153:17-37, 2013). But even cancers that have mutations that can be targeted using precision drugs eventually become resistant. And even seemingly uniform cancer subtypes, such as HER2-positive breast cancers, can show variable responses to drugs.
To combat these problems, researchers are getting craftier in their search for drug targets, applying new algorithms and systems-biology approaches to parse masses of data on what makes cancer cells tick. “There has been recently a quite big movement from looking into pure enrichment of mutations in genes to finding enrichment in mutations in functional elements,” says Martin Miller, a cancer systems biologist at the University of Cambridge in the U.K. This may mean searching for mutations that are not all in the same gene, but instead may be in protein domains with a common function; or searching for different mutations in a single, shared pathway.
Investigators are also looking beyond mutations that alter specific proteins and thinking about how changes to the expression of broad groups of genes may underlie or promote cancer.
“There’s now generally a move towards large whole-genome, whole-transcriptome profiling,” says Paul Scheet, a statistical geneticist at the University of Texas MD Anderson Cancer Center in Houston. “The data are cost-effective enough to generate en masse, and that puts the burden on analysis, or on creatively putting things together and looking for patterns that might be more subtle effects [than those] we would get from single genes and single mutations.”
Below, The Scientist outlines four strategies for hunting down new cancer targets and the insights they have generated.
DIGGING FOR DOMAINS
Strategy: Look for types of protein domains that are repeatedly mutated in cancer
How it works: Finding common mutations associated with cancer is relatively easy, as it requires examining only a small group of tumor genomes. But most genes are mutated only rarely in cancer, leaving researchers wondering whether mutations to these genes are simply inconsequential passenger mutations or whether they contribute to cancer development and progression.
In hopes of figuring out the significance of mutations that only show up occasionally in tumors, Miller and his colleagues investigated whether those mutations occurred disproportionately in certain protein domains (Cell Systems, 1:197-209, 2015). “Protein domains are these very well-conserved sequence stretches that are passed down in evolution through duplication,” Miller says. “The domain can appear in several different genes. By that principle you enhance statistical power.”
From The Cancer Genome Atlas (TCGA), a community resource run by the National Institutes of Health, Miller and his colleagues gathered sequence data on more than 5,000 pairs of tumorous and healthy tissue from individuals with 22 different cancers. The researchers then cross-referenced these data with definitions and maps of protein domains cataloged in Pfam-A, a protein family database, and looked for domains with unexpectedly high numbers of missense mutations, changes to DNA that lead to an amino acid substitution.
They uncovered novel cancer-related mutations that were linked to previously known cancer mutations in shared domains. For instance, it was already known that HER2 can have mutations at a particular spot, and the researchers found that the same domain has a potentially cancer-related mutation in the related genes HER3 and HER4. With this knowledge, researchers will be able to keep an eye out for cancer patients with mutated HER3 or HER4, and possibly treat them with drugs developed to treat people with HER2-positive cancer, says Miller. The researchers also found mutations in domains in genes that have never before been associated with cancer, such as in genes containing the peptidylprolyl isomerase domain, which is involved in folding and transporting proteins.
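To make the statistical idea concrete, here is a minimal Python sketch of pooling missense mutations by domain rather than by gene. It is not the method from the Cell Systems paper: it assumes a simple binomial null in which mutations land uniformly along protein sequence, and the function names, domain labels, lengths, and toy counts are all invented for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of k or more successes in n trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def domain_enrichment(mutations, domain_lengths, total_length):
    """Pool missense mutations by domain and test each domain for excess hits.

    mutations: list of (gene, domain_or_None), one entry per mutation.
    domain_lengths: residues covered by each domain, summed over all genes.
    total_length: residues in all proteins considered (assumed background).
    Returns {domain: (observed_count, p_value)}.
    """
    n = len(mutations)
    counts = {}
    for _gene, domain in mutations:
        if domain is not None:
            counts[domain] = counts.get(domain, 0) + 1
    results = {}
    for domain, k in counts.items():
        # Under the uniform null, a mutation hits this domain with chance p.
        p = domain_lengths[domain] / total_length
        results[domain] = (k, binomial_tail(k, n, p))
    return results


# Toy data (hypothetical): no single gene carries more than two mutations,
# but six of the ten fall in the same domain type.
toy_mutations = (
    [("HER2", "kinase_like")] * 2
    + [("HER3", "kinase_like")] * 2
    + [("HER4", "kinase_like")] * 2
    + [("GENE_X", "other_domain")]
    + [("GENE_Y", None)] * 3
)
toy_lengths = {"kinase_like": 100, "other_domain": 100}
toy_results = domain_enrichment(toy_mutations, toy_lengths, total_length=1000)
```

In this toy set, each gene on its own looks unremarkable, while the pooled domain count is far above what the uniform null would predict. That is the gain in statistical power Miller describes.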
Miller’s group has built a web tool called MutationAligner that other researchers can use to explore cancer-related mutations in protein domains (Nucleic Acids Res, 44:D986-91, 2016). Users can search protein domains of interest to see what cancer-related mutations they carry, or they can search genes of interest to see if they contain mutated domains.
Strategy: Identifying genes or groups of genes with upregulated or downregulated expression in cancer
Tool: Cancer in silico Drug Discovery (CiDD)
How it works: Scheet, his MD Anderson colleague Eduardo Vilar-Sanchez, and their coworkers built CiDD to look at gene-expression signatures in tumor subtypes and to match these gene-expression changes with drugs that may be able to reverse them (Mol Cancer Ther, 13:3230-40, 2014).
The researchers were inspired to build CiDD after realizing that many publicly available data sets were not being used to their full potential, due to the difficulty of integrating data between them. CiDD pulls together cancer-specific sequence and gene-expression data from the NIH’s TCGA; data on links between gene-expression changes and drugs from the Broad Institute’s Connectivity Map (CMap); and information on available cell lines from the Cancer Cell Line Encyclopedia (CCLE), a project of the Broad Institute and Novartis.
“There were these three awesome resources sitting in the cloud . . . and what CiDD does is it puts together a framework for these three to communicate with each other,” says Scheet.
When a researcher interested in a particular mutation inputs it to CiDD, the program first sifts through the TCGA data and downloads the gene-expression signatures from samples with the same mutation, searching for patterns seemingly induced by the mutation.
Next, the CiDD algorithm uses CMap to attempt to match the gene-expression pattern with “a drug that induces an opposite gene-expression signature,” says Anthony San Lucas, a former graduate student of Scheet and Vilar-Sanchez and now a postdoc at MD Anderson. Finally, once CiDD has come up with a potential drug to reverse the cancer-associated gene expression, the tool searches CCLE and determines which cell lines will be appropriate to use for experiments.
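The matching step can be pictured with a short Python sketch that ranks drugs by how strongly their expression signature runs opposite to a disease signature. This is an illustration only: it swaps in a plain Pearson correlation, which is not the scoring used by CiDD or CMap, and every gene label, drug name, and number below is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_reversing_drugs(disease_sig, drug_sigs):
    """Return (drug, correlation) pairs, most strongly opposite signature first.

    disease_sig: {gene: expression change in mutated vs. other tumors}
    drug_sigs:   {drug: {gene: expression change after treatment}}
    """
    scored = []
    for drug, sig in drug_sigs.items():
        shared = sorted(set(disease_sig) & set(sig))
        if len(shared) < 2:
            continue  # too few genes in common to compare
        r = pearson([disease_sig[g] for g in shared], [sig[g] for g in shared])
        scored.append((drug, r))
    return sorted(scored, key=lambda item: item[1])


# Hypothetical signatures: drug_A pushes every gene the opposite way.
disease_sig = {"G1": 2.0, "G2": -1.5, "G3": 1.0, "G4": -0.5}
drug_sigs = {
    "drug_A": {"G1": -1.8, "G2": 1.2, "G3": -0.9, "G4": 0.4},
    "drug_B": {"G1": 1.9, "G2": -1.4, "G3": 1.1, "G4": -0.6},
    "drug_C": {"G1": 0.1, "G2": 0.2, "G3": -0.1, "G4": 0.0},
}
ranked = rank_reversing_drugs(disease_sig, drug_sigs)
```

Here drug_A lands at the top of the list because its signature is almost a mirror image of the disease signature, while drug_B, which mimics the disease pattern, lands at the bottom.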
“You can identify a candidate drug without running any experiments of your own,” says San Lucas. This can be a boon for small labs that may not have the capacity for high-throughput drug screening.
A drug that reverses an entire network of misregulated gene expression should be less prone to the development of resistance than a drug directed at a single mutation. “It’s not relying on just one gene,” says Vilar-Sanchez. CiDD takes into account “the expression of many things happening at the same time.”
The researchers tested CiDD by asking it to find drug candidates for treating colorectal cancers with mutations in BRAF, a gene that encodes a signaling protein that modulates cell growth. The system suggested EGFR inhibitors and proteasome inhibitors, two classes of drugs that had already been identified as possible treatments for BRAF-mutant cancers.
Vilar-Sanchez is now working on applying CiDD to colon cancer prevention. Rather than searching for already-recorded gene expression patterns in TCGA, he is inputting gene-expression data from preneoplastic polyps that often develop into malignant tumors. He has already identified drugs that might make the polyps regress and is testing them in animals. Eventually, patients at high risk for developing colon cancer could undergo preventive treatment.
MASTERS OF REGULATION
Strategy: Identifying master regulators, the transcription factors or signaling proteins that serve as nodes in regulatory networks involved in maintaining a cancerous state
Tools: ARACNe, MARINa
How it works: Andrea Califano, a systems biologist at Columbia University, sees cancer as a homeostatic state of its own. Just as liver cells maintain the function of the liver in a constantly changing environment, regulatory loops help cancer cells maintain their cancerous identity regardless of surrounding events.
According to this theory, the key to disrupting the cancer state is to identify master regulators, proteins that control the expression of many downstream genes within these transcriptional loops.
In July 2015, Califano and Jose Silva, an associate professor of pathology at Mount Sinai’s Icahn School of Medicine, uncovered a master regulator in HER2-positive, hormone receptor (HR)–negative breast cancer (Genes Dev, 29:1631-48, 2015). HER2-positive breast cancers can be either HR-positive or HR-negative, depending on whether they have receptors for estrogen. HR-negative breast cancers have a higher death rate within five years than HR-positive breast cancers.
The researchers used RNA interference (RNAi) to systematically knock down gene expression in both cancerous and noncancerous cultures of human mammary cells, identifying 355 genes that apparently lowered the viability of the HER2-positive, HR-negative cancer cells when knocked down.
To narrow down this gene list, the researchers also took a systems-biology approach. Califano had previously come up with an Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe), a program that defines regulatory and signaling networks in cells based on gene-expression data. The researchers applied ARACNe to 359 breast cancer samples with expression data recorded in NIH’s TCGA.
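As published, ARACNe scores pairs of genes by the mutual information between their expression profiles and then prunes links that look indirect. The toy Python sketch below illustrates that general idea on expression levels that have already been binned into discrete states. It is not Califano's implementation, and the function names, gene labels, and scores are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discretized profiles."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def prune_indirect(mi):
    """Drop the weakest edge of every fully connected gene triplet.

    mi: {frozenset({gene_a, gene_b}): mutual information score}
    The weakest link in a triangle is treated as a likely indirect effect.
    """
    genes = sorted(set().union(*mi))
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        trio = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(edge in mi for edge in trio):
            to_remove.add(min(trio, key=lambda edge: mi[edge]))
    return {edge: score for edge, score in mi.items() if edge not in to_remove}


# Identical profiles share one full bit; independent ones share none.
mi_same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])

# Hypothetical scores: A-C is the weakest side of the triangle.
toy_scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.3,
}
toy_network = prune_indirect(toy_scores)
```

In the toy triangle, the A-C link is removed on the assumption that A influences C only by way of B, leaving a sparser network of candidate direct interactions.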
The team next combined that information with what they had learned about which genes are essential for cancer cell viability. They used another one of Califano’s programs called the Master Regulator Inference Algorithm (MARINa) to determine whether any of the essential genes were master regulators in regulatory or signaling networks.
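The way the two lines of evidence combine can be sketched in Python as follows: keep only regulators that were essential in the knockdown screen, then ask whether each one's targets are overrepresented among the genes that distinguish the tumor. This sketch swaps in a simple hypergeometric overlap test, which is an assumption made for illustration and not MARINa's own enrichment statistic. All regulator and gene names are hypothetical.

```python
from math import comb


def hypergeom_tail(k, n_genes, n_signature, n_targets):
    """Chance of k or more signature genes among n_targets random genes."""
    upper = min(n_signature, n_targets)
    hits = sum(
        comb(n_signature, i) * comb(n_genes - n_signature, n_targets - i)
        for i in range(k, upper + 1)
    )
    return hits / comb(n_genes, n_targets)


def candidate_master_regulators(network, signature, essential, n_genes, alpha=0.05):
    """Return (regulator, overlap, p_value) for essential, enriched regulators.

    network:   {regulator: set of target genes}, e.g. from an inferred network
    signature: set of genes dysregulated in the tumor
    essential: set of genes whose knockdown reduced cancer-cell viability
    """
    hits = []
    for regulator, targets in network.items():
        if regulator not in essential:
            continue  # the screen did not flag this regulator
        overlap = len(targets & signature)
        p = hypergeom_tail(overlap, n_genes, len(signature), len(targets))
        if p < alpha:
            hits.append((regulator, overlap, p))
    return sorted(hits, key=lambda hit: hit[2])


# Toy inputs (hypothetical): 20 genes in total, 5 of them in the signature.
signature = {"T1", "T2", "T3", "T4", "T5"}
network = {
    "REG_A": {"T1", "T2", "T3", "T4"},  # essential, targets all in signature
    "REG_B": {"T1", "U1", "U2", "U3"},  # essential, little overlap
    "REG_C": {"T2", "T3", "T4", "T5"},  # strong overlap, but not essential
}
essential = {"REG_A", "REG_B"}
candidates = candidate_master_regulators(network, signature, essential, n_genes=20)
```

Only REG_A survives both filters in this toy example, which mirrors how the researchers narrowed a long list of essential genes down to a regulator sitting atop the tumor's expression program.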
By the end of the process, the researchers had zeroed in on a single gene called STAT3, which codes for a transcription factor and master regulator seemingly key to maintaining cancer cells’ homeostatic state. STAT3 is part of the JAK-STAT pathway by which cells transmit signals from outside of the cell to influence gene expression in the nucleus. The JAK-STAT pathway is a positive feedback loop. “You turn on this vicious loop when you turn on HER2,” says Califano. “This keeps going even if you turn off HER2.”
A drug called ruxolitinib (Jakafi), approved by the US Food and Drug Administration in 2011 to treat a bone marrow disorder, blocks a tyrosine kinase known as JAK2. As a result of Califano’s research, Columbia doctors are currently testing a combination of ruxolitinib and trastuzumab (Herceptin) in a Phase 1/2 clinical trial to treat metastatic HER2-positive breast cancer.
This finding by Califano and his team on HER2-positive, HR-negative breast cancer is just one in a long string of successes using various combinations of RNAi, ARACNe, and MARINa to ferret out master regulators involved in cancers. For instance, the deadly brain cancer glioblastoma multiforme is also driven by a regulatory module that includes STAT3 and the transcription factor C/EBPβ (Nature, 463:318-25, 2010). And in December 2015, Califano’s team showed that HDAC6, a deacetylase that modifies the epigenetic states of histones, is a master regulator driving inflammatory breast cancer (Breast Cancer Res, 17:149, 2015). A clinical trial for an HDAC6 inhibitor is now enrolling patients.
“We’ve basically identified that there’s only a very small number of aberrant tumor states and they are very common across different types of tumors that you would have never put together,” says Califano. “We target not the initiating mutation, but we target the proteins that maintain the tumor state.”
Strategy: Searching for genes that simultaneously remain highly conserved in their sequence but change in expression
How it works: Death from cancer most often results from metastasis and the emergence of drug resistance. Robert Austin, a physicist at Princeton University, and his colleagues set out to learn why cancer becomes resistant to chemotherapy.
The researchers studied multiple myeloma, a cancer that begins in the honeycombed structure of the bone marrow. Small groups of cells isolated in the bone marrow’s nooks and crannies can easily evolve chemotherapy resistance. The researchers built a bone marrow–like labyrinth made up of hexagons connected to each other via narrow passageways just wide enough for the cancer cells to pass through. They then treated the cancer cells growing in the hexagonal chambers with gradients of the chemotherapy drug doxorubicin and waited for the cells to develop resistance.
Austin and his colleagues reasoned that some genes may be so important to keeping cancer cells alive in the face of threat that they would never change, so they decided to look for genes that were extraordinarily conserved in their sequence. “Which regions are nonmutated but upregulated or downregulated?” says Amy Wu, formerly a graduate student at Princeton and now a postdoc at the National Institute of Standards and Technology. “Those genes may play a role in elevated resistance.”
The researchers found that, in drug-resistant cells, genes that were both highly conserved and undergoing changes in expression tended to encode proteins key to the everyday functioning of healthy cells. For instance, the downregulated gene PSMC1 is involved in orchestrating DNA damage response and the cell cycle, while the upregulated gene PGK1 is involved in phosphorylation and glycolysis. Austin’s team also noticed that these conserved genes tend to have relatively ancient evolutionary origins.
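The filter Wu describes, nonmutated but upregulated or downregulated, can be written as a few lines of Python. This is only a sketch: the cutoff of a twofold change, the data layout, and the gene labels and values are assumptions for illustration, not figures from the study.

```python
def conserved_but_shifted(genes, min_abs_log2fc=1.0):
    """Split unmutated genes into upregulated and downregulated lists.

    genes: {gene: (mutation_count, log2 fold change, resistant vs. sensitive)}
    min_abs_log2fc: assumed cutoff; 1.0 means at least a twofold change.
    """
    up, down = [], []
    for gene, (mutation_count, log2fc) in sorted(genes.items()):
        if mutation_count != 0:
            continue  # sequence changed, so the gene is not conserved
        if log2fc >= min_abs_log2fc:
            up.append(gene)
        elif log2fc <= -min_abs_log2fc:
            down.append(gene)
    return up, down


# Hypothetical measurements for four made-up genes.
toy_genes = {
    "GENE_A": (0, 2.1),   # conserved and upregulated
    "GENE_B": (0, -1.7),  # conserved and downregulated
    "GENE_C": (3, 2.5),   # upregulated but mutated
    "GENE_D": (0, 0.2),   # conserved but unchanged
}
upregulated, downregulated = conserved_but_shifted(toy_genes)
```

Genes that pass the filter in either direction are the ones Austin's team would flag as candidates for a role in resistance.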
The researchers suggest that people developing drugs for cancer might want to look at targeting these ancient, conserved genes. “What we want to do is not kill them, but ramp the regulation back [up or] down to a more normal level,” says Austin.