Disentangling the good from the bad--gene and protein data, that is--may be the toughest task for today's bioinformatics scientists assembling new models for proteins, says Greg Paris, executive director of biomolecular structure and computing at Novartis Pharma Research in Summit, N.J. "One of the major advances has been the speed with which new genomes can be characterized and at least partially annotated, and there are very good gene-finding tools that help in this endeavor," he explains. "The minus is that the annotation process strongly relies on prior annotation, so that even with the most careful attention, it's possible to propagate low-probability guesses as though they were high-probability facts. This means that downstream, disentangling the quality of the evidence [from the quantities] is quite problematic. From a pharmaceutical perspective, a double-edged positive/negative is the vast quantity of data that is now available. It presents extreme challenges for data mining."

