A Mind Apart

Sean Eddy used his decades' experience playing video games to design software that found an entire new class of genes. And he's still looking.

Karen Hopkin
May 31, 2008
Credit: © Stephen Voss

Sean Eddy learned to write computer programs by playing Empire, a game he describes as a "fanatically detailed, multiplayer world simulation." As leaders of a country called Mirkwood, Eddy and his buddy Tom Jones dominated the Empire world in the late 1980s and early 1990s, winning battle after battle, including the first world tournament and the longest-running game in Empire history, which lasted nearly eight months.

Although play would occasionally take place at the expense of eating, sleeping, and doing his thesis research, Eddy says that Empire taught him the programming basics he would later parlay into software packages for identifying protein superfamilies and searching genomes for noncoding RNAs. "I wrote software to automate most of our economic and military operations in the game," says Eddy. He later recycled those same algorithms in his sequence alignment and RNA folding programs.

"Sean is generous with his intellectual...

"Sean was big into gaming, which is the best training you can get as a programmer," says MIT's Robin Dowell, who did her graduate work with Eddy. As a result, Eddy's algorithms are elegant, efficient, and "exceptionally popular," says Alex Bateman, who runs the protein family (Pfam) database at the Wellcome Trust Sanger Institute. "Pfam completely relies on Sean's software" to search for the sequence homologies that allow proteins to be classified into distinct families, he says.

Eddy has assembled similar software for cataloging RNAs, including algorithms that have allowed him to search whole genomes and identify hundreds of new RNA genes. "These programs are incredibly complicated; they give me a headache just trying to think how they work," says Bateman. "That's why I'm really glad someone like Sean is doing this. He writes very strong code that you can really rely on."

He's also happy to share. Not only is all his software open source, Eddy is quick to apply it to all sorts of problems. "If you send a paper to Sean for an opinion, he'll say, 'Your analysis didn't look right to me. So I ran this bit of computer code and came up with something different,'" says Gerry Rubin, director of Howard Hughes Medical Institute's Janelia Farm, who recruited Eddy as a group leader in 2006. "Sean is generous with his intellectual expertise, and he personifies the scientific ideal of seeking the truth. He's the kind of person you want to have around in the lab next door."

The Basics

Before he was in the lab next door, Eddy was the typical boy next door. He grew up on farmland in western Pennsylvania, where he explored the local creek and raised frogs and spiders. As an undergraduate, Eddy went to California Institute of Technology (Caltech) to try his hand at physics, but he found that the math was too difficult. "They had a remedial class for the challenged people who had not had the proper training, but even that was too hard for me," laughs Eddy. "The only bit I understood was probability theory, because I played poker and blackjack. Everything else was over my head."

But he did enjoy the bench work. So in 1986, with a bachelor's degree in biology, Eddy left to pursue his PhD at the University of Colorado at Boulder. Tom Cech had just discovered catalytic group I introns, and "Boulder was just a phenomenal place to do RNA research," Eddy says. For his thesis work, Eddy studied the movement of parasitic introns in bacteriophage T4. More importantly, he became intrigued with RNA and the role it might have played in life's origins, when "the earliest creatures replicating in the primordial soup were made of RNA." Remnants of that RNA world, such as catalytic RNAs, can be found in modern genomes. So Eddy decided to write programs to search for these molecular fossils. The problem? "The class of algorithms needed to do that was not known to computational biologists at the time," he says. Nor were they known to Eddy, who spent about six months messing around with some code before deciding to move on to a new project, in a new place.

That move took him to England, where he joined the lab of John Sulston at the MRC Laboratory of Molecular Biology in 1992. There he took on the project of determining how growing neurons find their targets. "What I needed was a way to visualize axons in a living animal," says Eddy, who was working on Caenorhabditis elegans at the time. "And I thought I could do it with luciferase, the firefly protein that makes light in combination with luciferin and ATP." But luciferase was a dead end. And not long after Eddy got his hands on some green fluorescent protein (GFP), he was scooped: Martin Chalfie of Columbia University had published the paper that launched the GFP era when he performed exactly the same studies that Eddy had planned to do. "That sort of submarined my project," he says.

Luckily, Eddy had a backup plan. As a side project, Eddy - in collaboration with his other postdoctoral mentor, Sanger's Richard Durbin - had been tinkering with the algorithms that would come to form the basis for his protein homology search software, the HMMER package. Those programs were working well, but RNAs are harder to handle than proteins because they form secondary structures based on base pairing. For example, in two related RNA sequences, the identity of one nucleotide might change over evolutionary time - say from a G in the first sequence to a U in the second. But if that nucleotide had been involved in forming a secondary structure, the base it pairs with will also change, from a C in the first sequence to an A in the second. "So to make a good computational model for RNA, you need a program that can deal with the fact that you have these base-pairing relationships in the RNA," says Eddy. "That was the algorithm that had been missing."
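The compensatory change described above can be made concrete with a toy sketch. It is not Eddy's algorithm or any part of HMMER or Infernal; the sequences, the list of paired positions, and the function names are all hypothetical, and only strict Watson-Crick pairs are counted. It simply flags positions where both partners of a base pair changed between two sequences while the pairing itself survived.

```python
# Toy illustration only: hypothetical names and data, not code from HMMER or Infernal.
# Simplifying assumption: only Watson-Crick pairs count (G-U wobble pairs are ignored).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_paired(seq, i, j):
    """True if the bases at positions i and j can form a Watson-Crick pair."""
    return (seq[i], seq[j]) in WATSON_CRICK


def covarying_pairs(seq_a, seq_b, pairs):
    """Return the (i, j) positions where both bases differ between the two
    sequences, yet the pair is intact in each sequence."""
    hits = []
    for i, j in pairs:
        both_changed = seq_a[i] != seq_b[i] and seq_a[j] != seq_b[j]
        if both_changed and is_paired(seq_a, i, j) and is_paired(seq_b, i, j):
            hits.append((i, j))
    return hits


# Two made-up hairpins with an assumed three-pair stem.
seq_a = "GGGAAAACCC"
seq_b = "GUGAAAACAC"   # G->U at position 1, C->A at its partner, position 8
stem = [(0, 9), (1, 8), (2, 7)]

print(covarying_pairs(seq_a, seq_b, stem))
```

A search method that scores each position independently would count the G-to-U and C-to-A changes as two mismatches; one that models the pair as a unit can recognize them as a single, structure-preserving event.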

Mining For More

And in 1994, that's the algorithm he found. The RNA homology software is now called Infernal, a name Eddy chose, says Bateman, "because it's one of the few words in the English language that has 'RNA' in it." The program allows users to search through genome sequences for RNAs that have conserved secondary structures. Running Infernal or similar RNA-seeking software is now a standard way to scan a genome for sequences with conserved structures, such as tRNAs. "But at the time nobody was doing this stuff, looking at RNA in a computational way," says Robert Waterston of the University of Washington, who hired Eddy when he was at Washington University in St. Louis in 1995. "A lot of the biological community thought sequencing shouldn't be done at all. Sean was less worried about that than about figuring out how to use the information."

Now Eddy is trying to use his programs to do more interesting things, such as finding whole new classes of undiscovered RNAs, including those remnants of the RNA world. "This was pre-microRNAs, so it was a bit of a conspiracy theory," says Eddy. "The idea that we could have missed hundreds of genes, whole classes of genes, was just considered stupid."

But find them, he did. In 1999, Eddy and his first graduate student, Todd Lowe, discovered dozens of small, nucleolar RNAs (snoRNAs) in the yeast genome. They published the work in Science. "Before the days of RNAi and microRNAs, these snoRNAs were a really big deal," says Lowe. They found even more of them in archaeal genomes, work that garnered them a second Science paper in 2000. Eddy and his wife, Elena Rivas, have also predicted hundreds of new regulatory RNAs in Escherichia coli.

He's not yet satisfied. "The original goal in developing this algorithm was to look for catalytic RNAs and things in the RNA world," Eddy says. "Instead, I've ended up with a bunch of regulatory RNAs in E. coli that are working by very stupid mechanisms," basically base-pairing with other RNAs. "I still would like to think there are catalytic RNAs lurking out there undiscovered."

A Mind Apart

If there are, Eddy will likely find them. "Sean is enormously imaginative and has a lot of very nice ideas, some almost bordering on science fiction," says Graeme Mitchison of the MRC Laboratory of Molecular Biology. "But he's also a complete realist and is ready to think of experiments and ingenious ways to test those theories."

In the meantime, he continues to improve on his existing software, making the programs run faster and use less memory. "There are a lot of tricks to make your software look better," says Bateman. "Sean doesn't make his software look better. He actually makes it better. Which is exactly what you want." He doesn't always document those steps in peer-reviewed publications, though. "It's amazing, but Sean has never published a paper on the HMMER software," says Bateman, "which shows that Sean cares more about producing a software package that can be used than he does about adding another career-advancing paper to his name."

The same attitude is reflected in how Eddy handled his students' publications. "When we were writing up the snoRNA paper, he was getting a lot of pressure from other faculty to get it finished," says Lowe. "But he was very patient with me. He didn't rip it away from me and finish it himself. He knew it was a learning experience. And he did what was in the best interest of his student and not in his own best interest."

Now, at Janelia Farm, Eddy will be able to return to doing the work he loves, whether that's digging for RNA fossils or forging new ground in computational neurobiology. With a limit of six people to a lab group, and no need to write grants, Eddy will have "more time to focus on the science, which is really where his heart is," says Dowell.

"For a PI at his level, Sean did a hell of a lot of work himself," says Barak Cohen, a former colleague at Washington University. "A lot of the fire in his lab came directly from him, straight from his brain through his own work and the work he handed down to other people. It's rare to see that in PIs of his level - getting his hands dirty and slogging it out. But I think he wanted to do more of that."

For that, the move to Janelia is just about the perfect thing for Eddy. "The perfect thing for Sean would be to give him a large pile of cash and his own island," says Cohen. "Just put him on the island and let him do his thing. Of course, the island would have to have good coffee. And an Internet connection. But I think science would benefit if we could just do that."