Cindy Magee
The Biopython project
Like its sister projects (BioPerl, BioJava, and BioRuby) at the Open Bioinformatics Foundation (OBF), Biopython includes code to manipulate and annotate sequences, communicate with remote databases, parse file formats, and execute common bioinformatics programs. According to project coordinator and University of Georgia grad student Brad Chapman, it also sports some unique features, including code for molecular modeling and protein structures, a clustering library, and a parser-generator library named Martel.
Martel allows developers to build new parsers by defining the file format with a series of "regular expressions on steroids," says Chapman. The actual parsing process is done with an underlying layer of C code, making the parsers highly scalable. The...
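To make the idea of describing a file format with composed expressions concrete, here is a minimal sketch that uses only Python's standard `re` module. It does not use Martel's actual API, and the record layout (`ID`, `DE`, and `SQ` lines ending in `//`), the names `RECORD` and `parse_records`, and the sample data are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical building blocks for a made-up flat-file format.
# These are plain regular expressions, NOT Martel's API.
ID_LINE = r"ID\s+(?P<id>\S+)\n"                  # e.g. "ID   seq1"
DESC_LINE = r"DE\s+(?P<description>[^\n]*)\n"    # free-text description
SEQ_LINES = r"(?P<sequence>(?:SQ\s+[A-Z]+\n)+)"  # one or more sequence lines
END = r"//\n"                                    # record terminator

# The whole record format is the concatenation of its parts.
RECORD = re.compile(ID_LINE + DESC_LINE + SEQ_LINES + END)


def parse_records(text):
    """Return a list of dicts, one per record found in text."""
    records = []
    for match in RECORD.finditer(text):
        fields = match.groupdict()
        # Drop the "SQ" prefix from each line and join the residues.
        fields["sequence"] = "".join(
            line.split()[1] for line in fields["sequence"].splitlines()
        )
        records.append(fields)
    return records


# Made-up sample input in the hypothetical format.
SAMPLE = (
    "ID   seq1\n"
    "DE   example protein fragment\n"
    "SQ   MKTAYIAK\n"
    "SQ   QRQISFVK\n"
    "//\n"
    "ID   seq2\n"
    "DE   second example\n"
    "SQ   GATTACA\n"
    "//\n"
)

if __name__ == "__main__":
    for record in parse_records(SAMPLE):
        print(record["id"], record["description"], record["sequence"])
```

The format definition lives entirely in the small named patterns, so supporting a new field means adding one more expression to the concatenation rather than rewriting the parsing loop.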