Pulling Out Proteins

Troubleshooting discovery and validation of protein biomarkers for cancer.

Kelly Rae Chi
Mar 31, 2009

Researchers have long pinned hopes on biological indicators—or molecular biomarkers—to serve as tools to help clinicians identify populations at risk, classify tumor types, or monitor disease progression in cancer, which is diagnosed in an estimated 11 million people worldwide each year. Genes have held the spotlight for many years as potential biomarkers of cancer, but increasingly, researchers are turning to individual proteins or groups of proteins and their modifications with the belief that these molecules hold secrets to cancer pathophysiology.

Because protein modifications can be extremely diverse (leading to a large number of isoforms) and proteins cannot be amplified as DNA can, they don't always offer a clear molecular distinction between a cancer patient and a healthy control. "There are several difficulties in the study of proteins that are not inherent in the study of nucleic acids," writes William Cho, a scientific officer in clinical oncology at Queen Elizabeth Hospital in...

For researchers lucky enough to find biomarker candidates, the story doesn't end there: the best candidates must be chosen and validated in more patient sets, a process that involves years of work. "People are doing a lot of technology development to discover biomarkers," says Akhilesh Pandey, an associate professor at Johns Hopkins University in Baltimore, Maryland. "What has still not come across is that once you do discovery you are actually not even close to being done."

The Scientist talked with researchers about some of these problems and how they went about solving them. Here's what we found:

Eliminating Amylase

<figcaption>Saliva bubbles in a petri dish Credit: Courtesy of Arnie Rosner</figcaption>

User: David Wong, Professor, University of California, Los Angeles School of Dentistry

Project: Searching human saliva for protein biomarkers of oral squamous cell carcinoma and other cancers

Problem: Salivary amylase is overabundant. "Most of the time what is being seen reduces the opportunity to identify lower-level proteomic information, such as cytokines and other low-molecular-weight proteins," says Wong. The group tried five different antibody-based technologies, to no avail. They needed a way to eliminate amylase and unmask potential biomarkers.

Solution: After two years, Wong's group finally found a method that seems to deplete amylase, described in a study published last year by a group of Israeli scientists (Electrophoresis, 29:1-8, 2008). The technique adds a filtering step for whole saliva. The setup is a 1-mL plastic syringe filled with 1 gram of potato starch, with a small 0.45-μm filter placed at the tip. Saliva is added to the top and pressed through the syringe using the plunger. "It really works quite well," Wong says.

Cost: $10/assay once it's set up

Considerations: When the Israeli group removed amylase, they used a 2-D gel to reveal 15 previously undetected proteins close to 60 kDa. In Wong's lab, a postdoc is still working on the technique. "The data shows that, at least on a 1-D gel, we can see that amylase band disappearing," he says. Amylase has been shown to bind starches, but does the new method pull out other things as well? "The immediate question we're asking right now is, 'In addition to amylase, are other proteins remaining behind or are they being depleted?'"

Lit Probe

User: Akhilesh Pandey, associate professor, Johns Hopkins University

Project: Validating potential biomarkers for early detection in pancreatic and breast cancer

Problem: There are so many candidate biomarkers needing validation that it's difficult for researchers to know which ones merit the investment of time and money. The choice often involves a rushed "cherry picking" of the candidates they are most familiar with. Pandey recently served on a nonprofit panel looking to fund validation work for 60 different potential biomarkers, but it proved challenging to decide which candidates should get priority.

Solution: In collaboration with several other scientists, Pandey pored over ~50,000 published articles on pancreatic cancer. They started with a broad list of any data on mRNA and proteins in pancreatic cancer. They divided this set by type of pancreatic cancer and by the cell type where overexpression was observed. Then, they asked, "Where are these molecules?" They carried out a broader literature search to pinpoint the tissues in which each molecule had been localized. Finally, they asked whether these molecules had also been implicated in chronic pancreatitis, a potential confounding factor. Pandey is planning to publish this work and make a web-based portal available with added, wiki-like features so that cancer researchers can add in both published and unpublished results.
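The curation steps described above (split by cancer type and cell type, record where each molecule has been localized, then check for a chronic pancreatitis report) can be pictured as a small record-keeping exercise. The sketch below is only an illustration and is not Pandey's actual database or portal: the `Candidate` fields, the placeholder names `MOL_A` to `MOL_C`, and the choice to set aside candidates that are also reported in chronic pancreatitis are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One literature-derived candidate (all field names are hypothetical)."""
    name: str
    cancer_type: str
    cell_type: str
    tissues: frozenset = frozenset()       # tissues where the molecule was localized
    in_chronic_pancreatitis: bool = False  # also reported in the confounding condition?


def group_candidates(candidates):
    """Group candidates by (cancer type, cell type of overexpression)."""
    groups = defaultdict(list)
    for c in candidates:
        groups[(c.cancer_type, c.cell_type)].append(c)
    return dict(groups)


def without_confounder(candidates):
    """Keep candidates with no chronic pancreatitis report (an assumed rule)."""
    return [c for c in candidates if not c.in_chronic_pancreatitis]


# Made-up placeholder entries, not real literature findings.
candidates = [
    Candidate("MOL_A", "ductal adenocarcinoma", "ductal epithelium",
              frozenset({"pancreas"}), False),
    Candidate("MOL_B", "ductal adenocarcinoma", "ductal epithelium",
              frozenset({"pancreas", "liver"}), True),
    Candidate("MOL_C", "endocrine tumor", "islet cell",
              frozenset({"pancreas"}), False),
]

groups = group_candidates(candidates)
shortlist = [c.name for c in without_confounder(candidates)]
print(shortlist)
```

A researcher working on a smaller scale could keep the same fields in a spreadsheet; the point is only that each candidate carries its context with it before any experiment is planned.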

Cost: 7000 person-hours

Considerations: Researchers considering validation studies can employ this strategy on a smaller scale for themselves instead of rushing into experiments. "This is vital because validation experiments are time consuming, cost money and yet are not 'glamorous.' It is important to get it right by working on a candidate biomarker with a higher likelihood of success," he says.

Niche Finder

<figcaption>2D gel of C.glutamicum proteins Credit: © Research Center Jülich</figcaption>

User: James Mobley, director of the urologic and clinical proteomics facility, University of Alabama at Birmingham

Project: Developing lipid and protein biomarkers for pancreatic cancer using serum, plasma, and urine

Problem: Mobley works with beginning biomarker researchers who are often caught in a Catch-22: In order to get funding for in-depth prospective trials, they need to spend money generating pilot data. When researchers came to him hoping to identify protein biomarkers for pancreatic cancer, they were faced with a mountain of choices, including which fluids and tissues to analyze and which platform (low-, medium-, or high-throughput) to choose.

Solution: They strategized to get one step above what others have published on blood and urine samples. Most previous studies used small, pooled sample sets combined with low-throughput platforms, such as 2-D gels, to pinpoint and identify proteins. In this case, "there didn't seem to be a whole lot of promise there," Mobley says. The few high-throughput studies found some protein differences between diseased and healthy states, but none of them investigated urine samples. "So we thought, if we do the study using [a high-throughput technology] in matched plasma, serum, and urine samples, there's probably a good chance that we'll see some differences and hopefully be able to differentiate pancreatitis versus cancer."

Cost: $7/sample for basic high-throughput profiling

Considerations: When funding is at a minimum, investigate lower-cost technologies that have yielded results, and know precisely what experimental pieces may have been missing. In the end, "in many cases, you are left with partially reinventing the wheel, but when that is the case, then at least make a better wheel," he says.

Searching Low

<figcaption>Immunohistochemical staining for multiple biomarker targets in nasopharyngeal carcinoma tissue Credit: Courtesy of William Cho</figcaption>

User: William Cho, scientific officer in clinical oncology at Queen Elizabeth Hospital, Hong Kong

Project: Comparing serum proteins of lung cancer patients who smoked and never smoked to those of healthy controls

Problem: Using protein fractionation methods, even after increasing the numbers of fractions to get better resolution, the researchers were missing low-abundance proteins present in the serum and plasma. In addition, this method is "quite tedious, less reproducible when you use the manual method, and more costly," Cho says. They wanted a method to help dig into less-abundant serum proteins.

Solution: In late 2005, Cho's lab tried a technology designed to enrich low-abundance proteins (Equalizer Beads, a product now called "ProteoMiner beads" sold by Bio-Rad). The technology works by having beads bound to a library of millions of different hexapeptides (ligands). Each bead-peptide combination binds to a different protein in your serum sample. The number of different bead-peptides available for binding is relatively equal, so a large percentage of the highly abundant proteins don't find ligands and are washed away. The process thus equalizes the amounts of high- and low-abundance proteins in your sample.
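The equalizing effect can be illustrated with a toy model. The sketch below assumes, for simplicity, that every protein has its own ligand and that every ligand has the same binding capacity, so the amount retained is the smaller of the protein's abundance and that capacity. The protein names, the abundances, and the capacity value are made up, in arbitrary units, and are not measurements from Cho's lab or specifications from Bio-Rad.

```python
def equalize(abundances, capacity_per_ligand):
    """Amount of each protein retained on the beads.

    Simplifying assumption: one ligand type per protein, each with the same
    capacity, so anything above the capacity is washed away.
    """
    return {name: min(amount, capacity_per_ligand)
            for name, amount in abundances.items()}


def dynamic_range(amounts):
    """Ratio of the most abundant to the least abundant protein."""
    values = [v for v in amounts.values() if v > 0]
    return max(values) / min(values)


# Illustrative serum sample in arbitrary units (values are invented).
serum = {
    "albumin": 40_000_000.0,
    "IgG": 10_000_000.0,
    "marker_y": 5.0,
    "cytokine_x": 0.5,
}

captured = equalize(serum, capacity_per_ligand=100.0)

print(dynamic_range(serum))     # range before capture
print(dynamic_range(captured))  # range after capture
print(captured)
```

In this model the two abundant proteins end up at the same capped value while the rare ones are retained in full, which narrows the range a downstream instrument has to cover. It also shows why the captured fraction no longer reflects the original relative amounts.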

The assay can take about a week to set up, Cho says. They found a small panel of low-abundance proteins present at less than 1 nanogram per mL, and are working on identifying these using mass spectrometry.

Cost: $100/sample using the ProteoMiner Introductory Kit (or $72/sample using the full kit)

Considerations: Because the proteins are "equalized," you can capture them but you can't quantify them. Cho says you may still be able to quantify later on, using the original crude serum and a proteomics platform. The system Cho uses isn't the only one out there for this; other companies such as Beckman Coulter and Sigma-Aldrich sell antibody-based kits to deplete high-abundance proteins.

Membrane Proteins

<figcaption>Endothelial cells in cell culture. Credit: Courtesy of Christoph Roesli</figcaption>

User: Christoph Roesli, postdoc, Institute of Pharmaceutical Sciences, Zurich, Switzerland

Project: Identification of membrane protein biomarkers to develop antibody-based therapies for cancer

Problem: Membrane proteins are often difficult to identify using proteomics tools because they are present in low abundance and are less soluble in water than intracellular proteins. What's more, they often come with chemical modifications, such as glycosylation attachments and bonds formed between cysteine amino acids on the proteins. Such modifications can hinder proteomics studies by making proteins difficult to digest with enzymes and difficult to analyze.

Solution: About five years ago, Roesli's colleague developed a way to capture proteins of interest by flushing the whole blood system of living mice with biotin, which modifies proteins on endothelial cells and in close proximity to blood vessels. After collecting and processing tissue of interest, they capture the proteins using a streptavidin sepharose column.

Last year, Roesli wanted to further improve the biotinylation reaction by adding two steps to break cysteine bonds and to remove N-linked glycan chains, respectively. After several rounds of optimization, the combination of tris-(2-carboxyethyl) phosphine and N-ethylmaleimide for cysteine bond breakage and PNGase F for the removal of N-linked glycans worked best.

Cost: less than $100/sample for the whole process

Considerations: Commercial kits are available for the biotinylation reactions, but, Roesli says, "almost all people I talked with say they have problems with the systems." If you're new to the method, don't always trust the kits. Expect to spend at least a month or two optimizing conditions in your own lab.

Main techniques for cancer biomarker discovery and validation

Two-dimensional gel electrophoresis
Description: An established method for identifying proteins by separating them based on charge across a defined pH gradient in one direction, and then by mass in the other direction.
Pros: 1) Ability to simultaneously monitor thousands of proteins; 2) Compatible with various staining methods for protein identification
Cons: 1) Lack of throughput potential and reproducibility; 2) Difficulties in resolving highly basic, high-molecular-weight, or low-abundance proteins

Matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry
Description: A method that ionizes proteins while keeping them intact, this can be used to identify proteins by mass, and in some variations, by sequence.
Pros: 1) Same-day data acquisition; 2) Highly automated, good for large numbers of samples
Cons: 1) Usually compatible with a narrow range of protein masses; 2) Peak intensity varies inexplicably between different proteins and experiments, but a two- to four-sample replicate may improve reproducibility

Enzyme-linked immunosorbent assay (ELISA)
Description: The traditional method of validating candidate biomarkers, this is an antibody-based method that captures and helps in quantification of specific proteins.
Pros: 1) Commonly used for low-abundance proteins; 2) Can detect proteins in the picomolar to nanomolar range
Cons: 1) Dependent on antibody availability or quality; 2) Too expensive to implement on a large scale