DNA Expression Profiling: A New Lens on Cancer

Ricki Lewis
Sep 21, 2003

Lumps, bumps, and unusual marks have long heralded cancer, from a bulging jaw in an australopithecine fossil, to traces of melanoma in a 2,400-year-old Incan mummy, to the frightening discovery of a dimpled breast today. Since the 1970s, a portrait of carcinogenesis has emerged as a series of genetic insults that push cells to proliferate, invade, and spread. Today, gene expression profiling is expanding that view to embrace a waxing and waning of protein levels that provides a dynamic backdrop to the choreography of cancer. And charting those changes may have predictive value.

"Microarray-based gene expression analyses are showing us that we probably do not have a firm understanding of the stages of carcinogenesis and metastasis. Individual cancers are likely to have metastatic potential very early in their natural history," says Jeffrey Boyd, director of the gynecological and breast research laboratory at Memorial Sloan-Kettering Cancer Center in New York. That is, a metastatic fate may be sealed long before its detection, affirming the importance of early diagnosis.

Oncologists have long known that patients with similar-appearing tumors can have wildly different experiences. A diagnosis of chronic lymphocytic leukemia, for example, could mean a rapid decline and demise--or years of life. So it makes sense that gene expression profiling would reveal tumor heterogeneity. But there have been surprises, too.

TECHNOLOGY EVOLVES Cancer evokes imagery, and the older ways to peek at the disease focused on the observable. Hippocrates introduced the Greek "karkinos" for crab, to capture the characteristic grasping growth of a malignant tumor. Nineteenth-century English surgeon Stephen Paget applied a "seed and soil" metaphor to describe how a cancer sends bits of itself to nest elsewhere.1 Seeking causes has been colorful too. Over the years, cancer has been attributed to excess bile, fermenting lymph, injury, irritation, and simply "melancholia."

In the 1930s and 1940s, pathologists subtyped leukemias by nucleus shape. In the 1960s, histochemical techniques distinguished lymphoid from myeloid lineages, a distinction refined a decade later with antibodies. The 1970s were also when attention turned to genes. Researchers superimposed sequences of mutations in proto-oncogenes and tumor suppressor genes onto classical histopathological stages of disease progression. That has turned out to be only a bare-bones explanation.

"When we first learned about oncogenes, the focus was on looking for the one gene that causes cancer. It is clear now that's not the best way to think about cancer," says Louise Showe, a professor at the Wistar Institute in Philadelphia who has identified a poor-prognosis 10-gene expression signature, from thousands of screened genes, for a rare skin lymphoma.2 Excess integrin B1, proteoglycan 2, RhoB oncoprotein and a pair of transcription factors, as well as deficiency of CD26, stat-4, and interleukin 1 receptors, presage a poor outcome. A surprise finding was plastin-T, not normally in lymphoid tissue, but in the tumor.

The idea to monitor the ebb and flow of proteins that parallels histological change originated more than two decades ago.3 Leonard Augenlicht, professor of medicine and cell biology at the Albert Einstein College of Medicine and Cancer Center in New York, used hybridization kinetics and autoradiography to screen 400 DNA sequences and produce an expression profile on an array--but a macro one. The patent describes monitoring cancer progression and response to chemotherapy.4 But even after building 4,000-probe arrays, the approach didn't catch on. "It was the mid-1980s, and oncogenes were just being extensively investigated. The thinking was, all we had to know was what ras and myc were doing, and that was it," recalls Augenlicht.

Eventually, biotech and pharmaceutical companies discovered the patent, and Augenlicht licensed the technology to Incyte Corp. in Palo Alto, Calif. Then microarrays entered the lexicon, and multigene analyses, of genes present or expressed, flourished.

By 1995, Bert Vogelstein, Ken Kinzler, and colleagues at Johns Hopkins University devised serial analysis of gene expression (SAGE), a cataloging of transcripts in a particular cell at a particular time.5 Soon expression profiling was applied to cancer, with Eric Lander's group at Massachusetts Institute of Technology pioneering its use to distinguish known leukemias and then to identify new types, without histopathological guidance.6 In 2002, they discovered that a subset of acute lymphoblastic leukemia (ALL) cases is actually a distinct type, mixed-lineage leukemia (MLL). Compared to other ALLs, the MLLs have a unique pattern of 200 overexpressed and 1,000 underexpressed genes, including unique hox gene mutations. The work identified a suite of new potential drug targets for this leukemia that often kills in infancy--perhaps its deadliness stemmed from its being lumped with ALL on the superficial grounds of similar morphology. A genetic analysis adds precision.

PROBING GENE EXPRESSION IN CANCER On an array, sample mRNA binds specific DNA sequences, the strength of the fluorescence signal reflecting degree of gene expression. Creativity enters in selecting probes and patients.

The more probes, the more one can learn, researchers say, because cancer cells hold surprises. For example, alongside the expected oncoproteins, transcription factors, chromatin remodelers, and cell-cycle controllers, Lander's group found the leptin receptor. Its anti-apoptotic effect on hematopoietic cells explained its upregulation in acute myeloid leukemia. In another study, Stephen Friend and colleagues at Rosetta Inpharmatics in Kirkland, Wash., scanned 25,000 genes, identified 5,000 that vary in expression at least twofold in cancer, and developed a 70-gene "prognostic signature" of poor outcome in breast cancer. Some of the findings were unexpected, such as upregulated metalloproteinases, while expression of some usual suspects in breast cancer, such as cyclin D1, HER-2/neu, and c-myc, was notably absent.7
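The twofold cutoff mentioned above can be made concrete with a minimal sketch. It assumes one intensity value per gene in tumor tissue and one in matched normal tissue; the gene names, the numbers, and the function name `genes_changed_twofold` are hypothetical and are not drawn from the Rosetta study.

```python
import math


def genes_changed_twofold(tumor, normal, min_fold=2.0):
    """Return {gene: log2(tumor / normal)} for genes whose expression
    differs by at least `min_fold` in either direction."""
    threshold = math.log2(min_fold)
    changed = {}
    for gene, tumor_level in tumor.items():
        # A log2 ratio treats a doubling (+1) and a halving (-1) symmetrically.
        log_ratio = math.log2(tumor_level / normal[gene])
        if abs(log_ratio) >= threshold:
            changed[gene] = log_ratio
    return changed


# Hypothetical fluorescence intensities, for illustration only.
tumor = {"gene_A": 820.0, "gene_B": 150.0, "gene_C": 95.0, "gene_D": 400.0}
normal = {"gene_A": 200.0, "gene_B": 160.0, "gene_C": 400.0, "gene_D": 250.0}

changed = genes_changed_twofold(tumor, normal)
for gene, log_ratio in sorted(changed.items()):
    direction = "up" if log_ratio > 0 else "down"
    print(f"{gene}: {direction}regulated, log2 ratio {log_ratio:.2f}")
```

Working on a log2 scale is a design choice, not something the article specifies; it simply lets one threshold catch both overexpressed and underexpressed genes.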

Choice of tumor material also affects the power of gene expression analysis. Experiments compare patients with the same diagnosis, compare cancerous with noncancerous tissue, track different regions of a tumor, or provide glimpses before and after treatment. Computers then seek patterns in the data. Algorithms select hierarchical clusters of similarly expressed genes that might be characteristic of only one group, or correlate a pattern to an outcome, such as metastasis or survival. A so-called neighborhood analysis selects one gene expressed significantly differently in two groups, adding others with similar profiles. The leave-one-out approach tests gene sets minus one gene at a time to see if a correlation to outcome persists, confirming the cluster's integrity.
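The neighborhood and leave-one-out ideas described above can be sketched in a few lines, under stated assumptions. The scoring rule (difference of group means divided by the sum of group standard deviations), the 0.8 correlation cutoff, the toy data, and all function names are illustrative choices, not the published methods of any group named here. The leave-one-out step follows the description in this article, dropping one gene at a time; the more common cross-validation variant withholds one sample at a time instead.

```python
import math
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """Difference of the two group means over the sum of their standard deviations."""
    a = [v for v, lab in zip(values, labels) if lab == 1]
    b = [v for v, lab in zip(values, labels) if lab == 0]
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))


def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)


def neighborhood(expr, labels, min_corr=0.8):
    """Pick the gene that best separates the groups, then add correlated genes."""
    seed = max(expr, key=lambda g: abs(signal_to_noise(expr[g], labels)))
    members = [g for g in expr
               if g == seed or pearson(expr[g], expr[seed]) >= min_corr]
    return seed, members


def set_score(expr, genes, labels):
    """Score a gene set by the separation of its per-sample average expression."""
    averaged = [mean(expr[g][i] for g in genes) for i in range(len(labels))]
    return signal_to_noise(averaged, labels)


def leave_one_gene_out(expr, genes, labels):
    """Re-score the set with each gene removed in turn."""
    return {g: set_score(expr, [h for h in genes if h != g], labels)
            for g in genes}


# Hypothetical data: six samples, 1 = poor outcome, 0 = good outcome.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "g1": [9.0, 8.5, 9.2, 3.0, 3.4, 2.9],
    "g2": [8.0, 7.6, 8.3, 2.5, 2.8, 2.2],
    "g3": [7.1, 6.8, 7.5, 3.9, 4.2, 3.6],
    "g4": [5.0, 5.2, 4.9, 5.1, 4.8, 5.3],
    "g5": [2.0, 6.0, 3.5, 5.5, 2.5, 6.2],
}

seed, members = neighborhood(expr, labels)
loo_scores = leave_one_gene_out(expr, members, labels)
print("seed gene:", seed)
print("neighborhood:", members)
for gene, score in loo_scores.items():
    print(f"score without {gene}: {score:.2f}")
```

If the score stays high whichever gene is removed, no single gene is carrying the association, which is the sense in which the check supports the cluster's integrity.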

LIMITATIONS AND CHALLENGES Finding a gene expression pattern is just a start; validation comes next, to avoid elevating a statistical fluke into a conclusion. "If one studies two groups of 50 individuals for differences among 50,000 genes, by random chance some differences will be found. Therefore, any microarray study done on a limited human population must be regarded as 'hypothesis generating,' until the gene differences found can be reproduced in a new set of individuals," explains Sandy Markowitz, a Howard Hughes Medical Institute investigator at Case Western Reserve University in Cleveland. The Rosetta Inpharmatics researchers, for example, applied the prognostic signature derived retrospectively from an initial 98 women to a new group of 231. The correlation to poor outcome prevailed.
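Markowitz's point can be put in numbers with a rough sketch. It assumes a conventional significance cutoff of 0.05 per gene, a two-sample t-test with equal group sizes, and purely random data in which no gene truly differs between the groups. The cutoff, the Bonferroni correction, and the function names are illustrative assumptions, not anything the researchers quoted here say they used. The simulation runs on 2,000 genes rather than 50,000 only to keep it quick.

```python
import math
import random


def expected_false_positives(n_genes, alpha=0.05):
    """Genes expected to pass a per-gene cutoff by chance alone."""
    return n_genes * alpha


def bonferroni_threshold(n_genes, alpha=0.05):
    """Per-gene cutoff that holds the chance of any false positive near alpha."""
    return alpha / n_genes


def _mean_and_variance(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def simulate_null_hits(n_genes, group_size=50, t_critical=1.984, seed=0):
    """Count 'significant' genes when both groups come from the same distribution.

    t_critical is the approximate two-sided 5% cutoff for 98 degrees of freedom
    (two groups of 50).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        a = [rng.gauss(0.0, 1.0) for _ in range(group_size)]
        b = [rng.gauss(0.0, 1.0) for _ in range(group_size)]
        mean_a, var_a = _mean_and_variance(a)
        mean_b, var_b = _mean_and_variance(b)
        pooled = (var_a + var_b) / 2.0  # valid because the groups are equal in size
        t = (mean_a - mean_b) / math.sqrt(pooled * 2.0 / group_size)
        if abs(t) > t_critical:
            hits += 1
    return hits


print("expected chance hits among 50,000 genes:", expected_false_positives(50_000))
print("Bonferroni per-gene cutoff:", bonferroni_threshold(50_000))
null_hits = simulate_null_hits(2_000)
print("chance hits among 2,000 simulated null genes:", null_hits)
```

At a 0.05 cutoff, roughly 2,500 of 50,000 genes would look different by chance alone, which is why a signature needs to be confirmed in an independent group of patients.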

A major limitation of gene expression profiling is that mRNA levels may not provide an accurate snapshot of the state of the cell. "Microarrays look at the level of RNA and not protein, and not at whether the proteins are phosphorylated, which must happen in signal transduction pathways for activity," explains I. Bernard Weinstein, professor of medicine at Columbia University. The approach may miss poorly expressed but vital genes.

Another challenge is procedural--sampling meaningful cells. Fluorescence-activated cell sorting can separate cancer cells from blood, but a solid tumor is a different story. "If you grind up a tumor to extract the RNA, you don't know where the signal is coming from. You must microdissect the tissue first," says Dennis Sgroi, an assistant professor of pathology at Harvard Medical School, who uses laser capture microdissection and RNA amplification.

Christine Iacobuzio-Donahue, an instructor of pathology at Johns Hopkins Medical Institutions, obtained transcriptional profiles to distinguish five regions in breast tumors--the actual cancer cells, the vasculature, inflammatory cells, surrounding stroma, and specialized stroma that touch the cancer cells, enticing them to invade. She found a similar tumor anatomy in the pancreas. "Rather than deal with overwhelming lists of genes generated from tumor tissues, we look at compartments of expression and potential regions of crosstalk. Our data indicate that tissue invasion is a more coordinated, highly regulated process than previously recognized," she says.

TUMORS ARE THE SAME--AND DIFFERENT So complex is a cancer's spread viewed under the new lens of gene expression that it can be difficult to reconcile results from different studies. Interpretation often depends on the groups being compared. David Beer, Samir Hanash, and colleagues at the University of Michigan, Ann Arbor, identified sets of highly expressed genes that correlate with good or poor survival in 86 patients with lung adenocarcinomas, for which traditional histopathology cannot distinguish tumors likely to spread.8 As expected, the more highly differentiated tumors usually expressed the good gene cluster. But the poor-prognosis group included a few well-differentiated, presumably young, tumors, indicating that Paget's seeds of metastasis may indeed be sown early--a finding echoed in several other cancers.

Rather than searching for gene expression signatures in tumors from many individuals, Sgroi's group worked with patients whose breast tumors included cells of different grades and stages, so that the participants could be their own controls.9 The three grades are well, moderately, and poorly differentiated, with correspondingly higher mitotic index, and the three stages are premalignant, preinvasive, and invasive. The results were surprising.

"We expected to see different stages correlate with particular patterns of gene expression, but we didn't see that at a global level. No consistent transcriptional program differentiates the stages. For example, we expected collagenase genes to be turned on for invasion only. But the gene expression for ductal carcinoma in situ is highly similar to that of invasive cancer. This means that invasive potential is already there in the preinvasive stage," Sgroi says. The fact that the tumor grades have "distinct transcriptional signatures" indicates that degree of differentiation is perhaps more important in setting outcome than the degree to which the tumor has invaded and spread. Yet the study did find an eclectic group of genes that are upregulated as the preinvasive stage becomes invasive and mitotic activity soars, including those encoding a centromere protein, a proteasome subunit, a topoisomerase, and various kinases.

Gene expression profiling of cancer is fleshing out the long-held, neat image of a Darwinian progression of new mutations spawning ever more aggressive clones. The idea that the course of cancer is set early on is certain to have vast clinical repercussions. Sums up Showe: "This is going to be a powerful new way to personalize medicine. Being able to look at many genes simultaneously to catch all the shades of the disease will be important for improving cancer diagnosis and treatment. We need to be looking at a bigger picture, and with this technology we are going to be able to do that."

Ricki Lewis (rickilewis@nasw.org) is a freelance science writer and textbook author in Scotia, NY.

References
1. S. Paget, "The distribution of secondary growths in cancer of the breast," Lancet, 1:571-3, 1889.

2. L. Kari et al., "Classification and prediction of survival in patients with the leukemic phase of cutaneous T cell lymphoma," J Exper Med, 197[11]:1477-88, June 2, 2003.

3. L. Augenlicht, H. Halsey, "Cloning and screening of sequences expressed in a mouse colon tumor," Cancer Res, 42:1088-92, 1982.

4. US Patent 4,981,783

5. V.E. Velculescu et al., "Serial analysis of gene expression," Science, 270:484-7, 1995.

6. S.A. Armstrong et al., "MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia," Nat Genet, 30:41-7, 2002.

7. L.J. van't Veer et al., "Gene expression profiling predicts clinical outcome of breast cancer," Nature, 415:530-6, 2002.

8. D.G. Beer et al., "Gene-expression profiles predict survival of patients with lung adenocarcinoma," Nat Med, 8:816-24, 2002.

9. X-J. Ma et al., "Gene expression profiles of human breast cancer progression," Proc Natl Acad Sci, 100:5974-79, May 13, 2003.