Classifying Breast Cancer Models

Image: Anne MacNamara

The exciting use of cDNA microarrays to reveal molecular subclasses of human tumors has spread to the study of animal models that mimic human tumors. With unsuspected subclasses of human lymphomas, melanomas, colon carcinomas, and breast carcinomas uncovered, researchers naturally have been inspired to apply microarray analysis to animal tumor models that in some instances have been studied for decades. How closely, they wonder, will experimental tumors resemble human tumors in details of gene expression?

Complete answers will take years, if for no other reason than the extraordinary number of models investigators need to examine. For breast cancer alone there are at least 100 different mouse mammary tumor models, says Jeffrey E. Green, head of the Transgenic Carcinogenesis Group at the National Cancer Institute (NCI). Green, Richard Simon, chief of NCI's Biometric Research Branch, and their colleagues recently published the first analysis of gene expression profiles of mouse mammary tumors (1). Their work is a foretaste of what the marriage of microarrays and tumor models will bring.

Analyzing mouse mammary tumors initiated in transgenic mice by overexpression of five transgenes--c-myc, c-neu, c-ha-ras, polyoma middle T antigen (PyMT), and SV40 T antigen (T-ag)--they found that each model shared gene expression similarities with human breast cancers. Not a surprising result, perhaps, but they added to this a deeper insight, one especially notable for those interested in methods of microarray data analysis. By avoiding what Simon calls misapplication of cluster analysis, the researchers discovered that the five oncogenes initiate three distinct gene expression profiles, what they call oncogenic signatures.

LIMITATIONS OF CLUSTER ANALYSIS

Their models were chosen for practical reasons. SV40 T antigen inactivates p53, a tumor suppressor frequently mutated in human breast cancers.
About one-third of human breast cancers overexpress the epidermal growth factor receptor erbb2/her2/neu, and 17% overexpress c-myc. Leading the project, Kartiki V. Desai used two microarrays, GEM1 from Incyte Pharmaceuticals, and the mouse oncochip, produced under Green's supervision at NCI, which together held 8,680 different mouse genes. For each model, Desai examined mammary tumors from four to six mice, accumulating an estimated 700,000 gene expression measurements.

The researchers began their data analysis by asking which genes were expressed differently between mammary tumors and normal tissues, defining different as at least twofold higher or lower than normal. Of 903 gene expression differences between cancerous and normal states, the most striking, found in every model, were heightened expression of genes involved in glycolysis and metabolism. They also found higher expression of translation elongation factors, cell cycle regulators, signaling receptors, G proteins, and transcription factors, confirming, says Green, "what has been known for decades: That cancer increases the rate of many metabolic pathways."

Despite different tumor-initiating oncogenes, the models had many similarities. Cluster analysis (average linkage hierarchical clustering, the most common method) based on the 903-gene subset revealed that profiles were highly correlated (> 0.8) within models, and almost as highly correlated (> 0.7) between models.

When they examined genes with known involvement in human breast cancer, researchers found that some genes were induced in all models. An example is the gene for acid beta-glucosidase; drugs that inhibit the enzyme reduce metastasis. Other induced genes were restricted to specific models; aldolase C, for instance, associated with human noninvasive breast cancer (ductal carcinoma in situ), was induced only in c-neu tumors.
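For readers who want to see the arithmetic behind the twofold criterion and the profile correlations described above, here is a minimal Python sketch. It is not the researchers' code: the gene names, the expression values, and the helper functions `twofold_changed` and `pearson` are all hypothetical, and the correlation is computed on log2 ratios, which is a common convention but an assumption here.

```python
from math import log2, sqrt


def twofold_changed(tumor, normal):
    """True when tumor expression is at least twofold higher or lower than normal."""
    ratio = tumor / normal
    return ratio >= 2.0 or ratio <= 0.5


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy measurements (arbitrary units), not data from the study.
normal = {"geneA": 100.0, "geneB": 80.0, "geneC": 50.0, "geneD": 200.0}
tumor_1 = {"geneA": 250.0, "geneB": 85.0, "geneC": 20.0, "geneD": 900.0}
tumor_2 = {"geneA": 300.0, "geneB": 70.0, "geneC": 15.0, "geneD": 700.0}

# Keep genes that differ at least twofold from normal in either tumor.
selected = [
    gene
    for gene in normal
    if twofold_changed(tumor_1[gene], normal[gene])
    or twofold_changed(tumor_2[gene], normal[gene])
]

# Compare the two tumors on the selected genes only, using log2 ratios.
profile_1 = [log2(tumor_1[g] / normal[g]) for g in selected]
profile_2 = [log2(tumor_2[g] / normal[g]) for g in selected]
r = pearson(profile_1, profile_2)

print("selected genes:", selected)
print("correlation between tumor profiles:", round(r, 3))
```

Hierarchical clustering then groups tumors by such pairwise correlations; that step is left out of this sketch.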
Discovering that the models had underlying differences, despite similarities, required a detour away from cluster analysis--a road to analysis that not everyone would have followed. "Many researchers would have simply applied cluster analysis to all 8,600 genes to see which expression profiles clustered to each other," Simon observes. Costly mistakes are made that way: "You can miss what you're trying to see."

A hazard in cluster analysis is the old bugaboo, signal-to-noise. "If you have relatively small numbers of genes reflecting differences between two kinds of mice, then you may not see them with cluster analysis," Simon says. "They may be washed out by the tens of thousands of genes that are not different." As he explains in a recent paper on microarray experimental design (2), numerous sources of variation (e.g., age of the animals, the way the RNA was handled) can weaken the precision of measurements. "All those sources of noise tend to swamp out the differences you care about."

Instead of cluster analysis, they used a standard statistical test, the F test, asking for each gene, says Simon, "Is this gene differentially expressed among the tumor models?" (The F test asks if differences in multiple means are greater than predicted by normal variation. The more familiar t test asks if the difference between two means is greater than predicted by normal variation.) F tests revealed differential expression in 930 mouse mammary tumor genes.

Key to believable results with the F test was requiring a statistical significance level of 0.1%, a far more demanding standard than the routinely used 5%. Simon explains, "If you select genes using a 0.05 statistical significance standard and use 10,000 genes, you may wind up with 500 false positives." If one selected 0.001 instead, only 10 false positives would be expected.
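The reasoning above can be made concrete with a short Python sketch. It assumes a plain one-way analysis of variance for a single gene and uses made-up log-ratio values; the function names `f_statistic` and `expected_false_positives` are illustrative, not part of any package mentioned in this article. Turning the F statistic into a p-value requires the F distribution (SciPy's `scipy.stats.f_oneway` does this), and that step is omitted here.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for one gene measured in several tumor models.

    groups: list of lists, one list of expression values per model.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the model means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual tumors around their own model mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def expected_false_positives(n_genes, alpha):
    """Expected number of genes passing the test by chance alone,
    assuming none of the genes is truly differentially expressed."""
    return n_genes * alpha


# Hypothetical log ratios for one gene in three models, four tumors each.
differential_gene = [[1.0, 1.2, 0.8, 1.0], [3.0, 3.1, 2.9, 3.0], [1.1, 0.9, 1.0, 1.0]]
flat_gene = [[1.0, 1.2, 0.8, 1.0], [1.1, 0.9, 1.0, 1.0], [1.0, 1.1, 0.9, 1.0]]

f_diff = f_statistic(differential_gene)
f_flat = f_statistic(flat_gene)
print("F, differential gene:", round(f_diff, 1))
print("F, flat gene:", round(f_flat, 3))

# The false-positive arithmetic quoted by Simon.
print(expected_false_positives(10_000, 0.05))
print(expected_false_positives(10_000, 0.001))
```

A large F means the model means differ by more than the tumor-to-tumor scatter within each model would predict; the stricter the significance level, the fewer genes pass by chance.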
SEEING IS BELIEVING

Avoiding false positives still left 930 genes, more than enough to make it "hard to visualize with gene expression measurements how similar one tumor was to the next," says Simon. So they returned to cluster analysis with multidimensional scaling, a tool for visualizing clusters. "As long as you limit the number of false positives," he says, "it is a reasonable tool for visualizing your data. In this case it permitted us to identify classes we would not have seen had we just relied on the statistical tests."

Multidimensional scaling "reduces the data to a set of pairwise distances that can be displayed graphically," Simon explains, "allowing you to look at tumors in terms of expression profiles, and which ones are similar." Each mammary tumor is represented by a point. Distances between points reflect differences in expression of the 930-gene subset; the closer the points, the more similar the expression profiles. When points and distances were displayed in three dimensions, three clusters (three categories of tumors, says Green) emerged. One category was tumors overexpressing SV40 T antigen. A second was tumors overexpressing c-myc. The third category grouped the remaining tumors, those created by c-neu, c-ha-ras, and PyMT.

ONCOGENIC SIGNATURES

The SV40 T antigen tumors made up the largest category. More than 100 genes in its oncogenic signature were absent from the other categories. Altered regulation was noted for genes governing the cell cycle (G1/S and G2/M transitions), DNA replication, and apoptosis. Unique aspects of gene expression were the relatively minor disturbance of DNA repair and induction of a variety of calcium-binding proteins. Overall, the oncogenic signature indicated that tumors arose from breakdown in cell-cycle regulation.

The c-myc category in some respects resembled the T antigen category, inducing some of the same genes, notably genes for cell-cycle control.
Where the c-myc category differed was in altered expression of a number of known c-myc targets such as c-fos and dihydrofolate reductase, upregulation of a wide variety of transcription factors, and induction of rRNA.

In the final group, c-neu, c-ha-ras, and PyMT tumors exhibited high degrees of overlap in gene expression, consistent with tumor mechanisms based on perturbing ras pathways involved with mitogen stimulation of cell proliferation. Many G proteins, GTPase activating proteins (GAPs) and serine-threonine kinases were induced. The oncogenic signatures of these tumors were not associated with altered DNA replication or G2/M cell cycle regulation.

Missing from the analysis was an explanation for the high incidence of metastasis in PyMT tumors. "It wasn't obvious from the pattern of gene expression," says Green, "but keep in mind that this only represented a subset of genes in the mouse genome. Many genes have not been analyzed." And, of course, he hastens to add, the explanation might lie outside the level of gene transcription.

Free Software: Find BRB ArrayTools and technical reports on microarray analysis online at linus.nci.nih.gov/~brb.

BETTER MODELS, BETTER ANALYSIS

"There is no clear distinction as to which model more closely mimics what happens in human breast cancer," Green says. The next experiments will examine gene expression profiles of mammary tumors with mutations in p53 and BRCA1, one of the genes for inherited breast cancer susceptibility. From this work Green foresees better matching of models to particular experimental questions. For instance, SV40 T antigen tumors might be the best for preclinical testing of anticancer drugs based on calmodulin inhibition, because those tumors induce calcium-signaling pathways.

Green and Simon say their current work remains incomplete until there is an analysis of human vs. mouse microarray data. "Now we're trying to make statistical correlations between data generated for human breast cancer and data that we generated for mouse models," says Green. It will be painstaking labor. The obvious problem in microarray comparisons is the correspondence problem--which mouse genes correspond to which human genes. The internal reference problem will be harder, says Simon: "With microarrays you have some sort of internal reference. Measures of gene expression are usually relative to the internal reference." Here the internal reference was derived from pooled normal murine epithelial mRNA. "When we use human arrays," Simon says, "we will either have to use pooled human epithelial mRNA or use an adjustment that takes into account that the internal references are different." It won't be simple. But it is worth doing, he says, "because after all, it is what you want to know: Which of these are good models of human disease."

In the meantime, the Biometric Research Branch where Simon works hopes to lessen the confusion about microarray data analysis with a software package called BRB ArrayTools, free to scientists at nonprofit institutions. The data analysis and visualization software "is our attempt to clone our experience," Simon says, "the lessons learned from years of analyzing microarray data." He hopes the software will encourage better approaches to data analysis, noting that much of what is available, even commercial software, encourages just the opposite. That his nonstatistically minded colleagues could use some friendly advice he does not doubt. "Biologists for the most part are analyzing their own data," he observes, "and for the most part doing a very bad job of it."

Tom Hollon (thollon@starpower.net) is a freelance writer in Rockville, Md.

References

1. K.V. Desai et al., "Initiating oncogenic event determines gene-expression patterns of human breast cancer models," Proceedings of the National Academy of Sciences, 99:6967-72, May 14, 2002.

2. R. Simon et al., "Design of studies using DNA microarrays," Genetic Epidemiology, 23:21-6, June 2002.

MICROARRAY MYTHS AND TRUTHS

Myths

That the greatest challenge is managing the mass of microarray data;
That pattern recognition or data mining are the most appropriate paradigms for the analysis of microarray data;
That cluster analysis is the generally appropriate method of data analysis;
That comparing tissues or experimental conditions is based on looking for red or green spots on a single array;
That reference RNA for two-channel arrays must be biologically relevant;
That multiple testing issues can be ignored without filling the literature with spurious results;
That complex classification algorithms such as neural networks perform better than simpler methods for class prediction;
That prepackaged analysis tools are a good substitute for collaboration with statistical scientists in complex problems.

Truths

The greatest challenge is organizing and training for a more multidisciplinary approach to systems biology. The greatest specific challenge is good practice in design and analysis of microarray-based experiments.

Pattern recognition and data mining are often what you do when you don't know what your objectives are. Effective microarray-based research requires clear objectives.

Cluster analysis is useful for some types of studies, such as finding potentially coregulated genes. For most microarray studies, however, supervised methods of analysis are much more powerful.

Comparing expression in two RNA samples tells you only about those samples and may relate more to sample handling and assay artifacts than to biology. Robust knowledge requires multiple samples that reflect biological variability.

The reference generally serves only to control variation in the size of corresponding spots on different arrays and variation in sample distribution over the slide.
Comparing two classes of samples with regard to expression of 20,000 genes, one expects 1,000 erroneous findings of genes that appear differentially expressed at the 5% significance level. This is true regardless of the correlation patterns of the genes. Eyeball analysis of multicolored image plots for genes that appear differentially expressed is no more reliable.

"Artificial intelligence" sells to journal reviewers and institute leaders who cannot distinguish hype from substance when it comes to data analysis. But comparative studies have shown that simpler methods work better for microarray problems where the number of candidate predictors greatly exceeds the number of samples.

Biologists need both good analysis tools and good statistical collaborators. Both are in short supply.

--Richard Simon
