1. Try orthogonal approaches
Every assay has strengths and weaknesses, so it's a good idea to try more than one, if possible. ChIP-on-chip is significantly cheaper than ChIP-Seq, but unless you tile the entire genome, you will miss any region not represented on your chip. HELP probes only a fraction of the genome, but a genome's worth of testable HpaII fragments will fit on a single array, making analysis relatively inexpensive. (Greally now uses a 1.32-million element Nimblegen array.) You can use other assays (not described here) to double-check your results (see #2).
Generating data is not difficult, says Greally. "The challenge isn't the assay, it's what you do afterwards. People don't pay sufficient attention to validation," he says. As with gene-expression microarrays, it is wise to pick a subset of data points and check them...
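One way to pick such a subset without bias is to draw it at random rather than hand-selecting the most striking hits. The sketch below is only an illustration: the function name, the region identifiers, and the subset size of 20 are assumptions, not anything Greally describes.

```python
import random

def pick_validation_subset(data_points, k, seed=0):
    """Return k data points chosen uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the same subset can be re-drawn later
    k = min(k, len(data_points))  # never ask for more points than exist
    return rng.sample(data_points, k)

# Hypothetical region identifiers standing in for real array or sequencing hits.
candidate_regions = [f"region_{i}" for i in range(1000)]
subset = pick_validation_subset(candidate_regions, k=20)
```

Recording the seed alongside the results lets a collaborator reproduce exactly which data points were sent for independent checking.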
3. Build collaborations
Genome-scale experiments yield data in volumes most biologists are completely unprepared to handle. "Forty-four million reads is actually more than we can comprehend," says Jones. "If you were to use BLAST on a powerful Mac, that alone would take about 12 years to process the data." Young's advice: Follow "the biologists' tradition" of focusing on the biology and farming out the rest. "That's how many people solve this problem: they reach out to colleagues and establish multidisciplinary collaborations."
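To put Jones's figures in perspective, a quick back-of-the-envelope calculation shows what they imply per read. This is a rough sketch that assumes the reads are processed one after another on a single machine and takes the quoted 12 years and 44 million reads at face value; it is not a benchmark of BLAST.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # average year, including leap days

def seconds_per_read(total_reads, total_years):
    """Average time spent on each read if the whole job takes total_years."""
    return total_years * SECONDS_PER_YEAR / total_reads

# Figures taken from the quote: 44 million reads, about 12 years.
per_read = seconds_per_read(44_000_000, 12)
print(f"{per_read:.1f} seconds per read")  # prints "8.6 seconds per read"
```

A few seconds per read sounds harmless for a single sequence, which is why the total only becomes alarming once it is multiplied across tens of millions of reads.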
4. Focus on experimental design
An abundance of data cannot compensate for a poorly designed study. Zamore says, "Make sure you're comparing two things that can meaningfully be compared before you bother generating two million sequences." For Zamore, that meant deciding on which fly strains to use. "When you compare two flies, if the number of siRNAs against some transposon changes, is that because the transposon content of the two strains is different? Or, it could be because the mutant is required to make those siRNAs," he says. "So setting your genetics up to tell the difference between those two things is important."