Gene Expression Data Mining

Image: Courtesy of Rosetta Biosoftware. The Rosetta Resolver system's Image Viewer application showing an Affymetrix GeneChip probe array.

Written by Gail Dutton


There is a saying: "Be careful what you wish for, you just may get it." Biologists long pined for faster, more efficient ways to gather data; now they generate genomic information faster than they can assimilate it. The result: information overload. The solution: data mining.

Though data mining is an ambiguous term, most definitions include the idea of dealing with very large data sets and enabling exploratory data analysis, says Simon Lin, manager, Bioinformatics Core Facility, Duke University. That approach is handy when you're not sure what you're looking for. Traditional analysis, in contrast, tests a hypothesis.

"With data mining," Lin says, "you're always getting something unexpected." He cites a clinical collaboration in which a second, heretofore unknown, disease subtype was found, helping to explain why the standard treatment failed for some patients. "We're using data mining to generate more hypotheses, not to confirm therapies," he emphasizes. In another collaboration, ...
