Not long after Rob Knight started his first lab at the University of Colorado Boulder in 2004, one of his graduate students, Catherine Lozupone, approached him and asked to make a change. Knight’s lab was studying an appendage called the Type III secretion system, which some Gram-negative bacteria use to detect and infect eukaryotes. Lozupone told Knight she didn’t want to study the secretion system any more. Instead, she said, she wanted to study microbial diversity using computational methods.
“We had had a conversation about this when she first joined the lab, and I had said to her, ‘You know Cathy, I don’t want to stop you from following your dreams, but at the same time, Type III secretions is a hot area of science. And as far as the microbial diversity stuff, even MacArthur genius award winner and National Academy member Norman Pace can’t get funded to do microbial diversity work,’” Knight recalls. He explained he would feel bad if, when she finished her PhD, she found herself working in a “backwater field that no one cares about while the Type III system is on the cover of Science.”
Lozupone persisted, and in 2005 she and Knight published a paper describing UniFrac, a computational tool to identify differences in the compositions of microbial communities using phylogenetics. After the UniFrac publication, Knight gradually shifted the focus of his lab toward studying the complexities of microbial ecosystems. And the Type III secretion system? “We never published anything on that. Cathy’s work then became the basis for QIIME, [a bioinformatics tool that is] now the most widely used way to analyze high-throughput microbial community sequencing data,” Knight says.
Because of that work and the work he’s done since, Knight is among the leaders in microbiome research: cataloging the bacterial and other microscopic residents of the human body, of other animals, and of environments from people’s kitchens to Arctic soil. He and colleagues use their data to explore how bacterial communities affect human health and ecological environments.
Carving a path
Knight was born in 1976, the oldest of three boys. He grew up in Dunedin, on the South Island of New Zealand, and his parents, John and Alison, were both immunology researchers at the nearby University of Otago.
Knight enjoyed being outdoors, exploring the Otago Peninsula and the coast. He liked reading science fiction, was interested in fossils, and as a seven-year-old, built dinosaur models out of papier-mâché. Knight was also drawn to computers from an early age. In 1994, he enrolled in the University of Otago, where he studied biochemistry—an interest sparked by his high school biology and chemistry classes—and thought he would become a chemical engineer. He doubled up on courses and graduated in just two and a half years. “I thought of classes as basically a distraction [from] the research I wanted to do,” Knight says.
Guided by the scientific papers he read on the subject, Knight did his own independent population modeling as an undergraduate and wanted to apply this concept to control pests—rabbits and other non-native mammals—introduced to New Zealand. In 1995, he received a grant and traveled to Princeton University in New Jersey to work in Lee Silver’s molecular biology lab to use genetics, specifically gene drives, to control pest populations. However, Silver ended up closing his wet lab to focus on policy and bioethics.
Knight was accepted as a graduate student at Princeton in 1997 and had to find a new laboratory to join. He chose Laura Landweber’s lab because he was drawn to her studies on evolution. He intended to do experimental biology, and spent most of his time on it, but he wound up expanding his programming skills and publishing far more results using computational approaches. Others in the lab also influenced him, including then-postdoc Stephen Freeland, now a professor at the University of Maryland, who taught Knight the value of using computational approaches for evolutionary questions.
Knight decided to focus on DNA, starting with an unexplained observation made by early geneticists: guanine-cytosine (G-C) nucleotide content correlates with the usage of certain amino acid codons across the genome. Knight developed a program to analyze the protein-encoding genes of 596 bacterial, archaeal, and eukaryotic genomes, and built a simple linear regression model based on mutation and selection rates. The work showed that it was the G-C content of a genome that drives the use of certain codons, which helped explain why different organisms prefer different codon sequences to encode the same amino acid. For example, the amino acid phenylalanine can be encoded by two different codons, UUU or UUC. Species with genomes having a high G-C content will prefer the UUC codon and other codons with guanine and cytosine. “I was able to use mathematical modeling to understand a big mystery of how the genetic code is used across different species,” Knight says.
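To make the phenylalanine example concrete, here is a minimal Python sketch of the two quantities involved: a sequence’s G-C content and the share of its phenylalanine codons that are UUC rather than UUU. It is an illustration only, not Knight’s program or model; the function names and the two toy sequences are hypothetical and were made up for this example.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def codons(seq: str) -> list[str]:
    """Split a coding sequence into triplets, reading T as U."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def uuc_fraction(seq: str):
    """Share of phenylalanine codons (UUU or UUC) that are UUC.

    Returns None when the sequence has no phenylalanine codon.
    """
    phe = [c for c in codons(seq) if c in ("UUU", "UUC")]
    if not phe:
        return None
    return phe.count("UUC") / len(phe)


# Hypothetical toy sequences, invented for illustration (not real genes).
at_rich = "AUGUUUAAAUUUGAAUUCUAA"
gc_rich = "AUGUUCGCCUUCGGCUUUUGA"

for name, seq in [("at_rich", at_rich), ("gc_rich", gc_rich)]:
    print(name, round(gc_content(seq), 3), round(uuc_fraction(seq), 3))
```

In these toy sequences the one with more G and C also uses UUC more often. A genome-scale analysis would compute the same two numbers across many genes and genomes before fitting any model to them.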
After graduate school, Knight joined Michael Yarus’s lab at CU Boulder as a postdoc to work on RNA sequences. He was particularly interested in identifying computational approaches that could provide insight into how an RNA sequence element—a specific sequence that is found within RNA—confers a function, such as binding of a protein. Identifying such biologically functional sequences within genes that code for RNA and understanding the secondary structure of an RNA molecule once it’s transcribed helps researchers understand its behavior and binding partners.
Knight analyzed the relationship between an RNA’s sequence element and its secondary structure and function, revealing recurring sequences called motifs, which could be used to identify DNA sequences that encode RNA molecules and to predict the RNAs’ secondary structure. That’s not an easy task. “The trick was that if you want to identify an RNA sequence with a secondary structure, it would take way too long to randomly generate sequences, fold them, and then check if they have the right structure,” Knight explains. “But, if you calculate the probability that you have the right RNA structure based on an RNA sequence element, this saves orders of magnitude in computation time to make these RNA searches manageable.” Even then, it took a supercomputer to run the calculations.
On to microbes
Knight finished his postdoc in 2004 and started his own lab at Boulder that same year. It didn’t take too long for Lozupone, who now runs her own lab at CU Denver, to push for a different direction and to start studying microbial diversity, and for Knight to begin collaborating with Norman Pace. Pace’s lab was doing microbiome studies, sequencing the ribosomal RNA (rRNA) genes from bacteria found in hospitals and other constructed environments. These rRNA genes are essential for protein synthesis in all living organisms and are therefore highly conserved. To identify bacterial species using DNA sequencing and to reveal the evolutionary relationships among species, researchers had traditionally homed in on rRNA genes. Lozupone and Knight realized that biologists lacked tools to analyze the large amount of sequencing data that these microbiome studies created. So they created UniFrac, software that constructs evolutionary trees from bacterial sequencing data and identifies the parts of the phylogenetic trees that microbes have in common and those that are unique, rather than simply counting the number of species present in each sample. The paper describing the software was published in 2005 and has been cited more than 4,000 times.
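The idea of comparing communities by shared versus unique parts of a tree can be sketched in a few lines of Python. The version below follows the commonly described unweighted form of the metric: the branch length leading to sequences from only one sample, divided by the branch length leading to sequences from either sample. It is a simplified illustration, not the UniFrac software itself; the tree encoding (each branch as a length plus the set of tips beneath it), the function name, and the four-tip toy tree are all assumptions made for this example.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Fraction of observed branch length that is unique to one sample.

    branches: iterable of (length, tips) pairs, where tips is the set of
              sequence names that descend from that branch.
    sample_a, sample_b: sets of sequence names observed in each sample.
    """
    unique = 0.0
    total = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a or in_b:          # branch leads to at least one observed sequence
            total += length
            if in_a != in_b:      # branch leads to sequences from one sample only
                unique += length
    return unique / total if total else 0.0


# Hypothetical toy tree ((s1:1, s2:1):1, (s3:1, s4:1):1), one entry per branch.
toy_branches = [
    (1.0, {"s1"}),
    (1.0, {"s2"}),
    (1.0, {"s1", "s2"}),
    (1.0, {"s3"}),
    (1.0, {"s4"}),
    (1.0, {"s3", "s4"}),
]

print(unweighted_unifrac(toy_branches, {"s1", "s2"}, {"s3", "s4"}))  # no shared branches
print(unweighted_unifrac(toy_branches, {"s1", "s2"}, {"s2", "s3"}))  # partly shared
print(unweighted_unifrac(toy_branches, {"s1", "s2"}, {"s1", "s2"}))  # identical samples
```

Two samples that sit on entirely separate parts of the tree score 1, identical samples score 0, and partial overlap falls in between. That is how a tree-based distance differs from simply counting the species in each sample.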
Knight’s next step was to use the software in a collaborative study with Jeffrey Gordon, a gastroenterologist at Washington University School of Medicine in St. Louis who explores the relationships between beneficial gut microbes and health. The researchers sequenced the microbes found in the intestines of four groups of mice—obese mice with two copies of a mutation that confers excessive overeating; their siblings that carry one copy of this mutation; normal weight, wild-type siblings; and the mothers that also carry one mutation copy. All the mice were fed the same diet. UniFrac analyzed the rRNA gene sequences and showed that the obese mice had 50 percent fewer of one bacterial group and 50 percent more of another group than healthy mice, suggesting that obesity affects the diversity of the gut microbiota. Still, “no one really cared about microbiomes at the time. I thought of this still as just an interesting side project,” Knight says.
He continued to collaborate with Gordon on the link between obesity and the gut microbiome. In 2009, the team studied human lean-obese twin pairs, and found that leanness and obesity are associated with distinct microbiomes. About a year later, Knight’s lab released QIIME (Quantitative Insights Into Microbial Ecology), a software tool that allows users to analyze an enormous amount of raw sequencing data, compare sequences, create phylogenetic trees, and generate visual plots and graphs of the analyses. In that same year, the field of microbiome research began to take off, with Gordon’s and Knight’s labs at the forefront. In 2013, the team published a study in Science showing that transplanting human fecal microbiota from either a lean or an obese human twin into germ-free mice transferred the corresponding body phenotype to those mice.
A more advanced machine learning technique that Knight’s lab developed in 2011 now allows researchers to predict whether a person is lean or obese with about 90 percent accuracy based only on that person’s microbiome. Predictions based on genetics alone are far less accurate, at 57 percent, a prior study found. “The ability to classify an obese body type using the microbiome but not genetics suggests the importance of environmental rather than genetic causes of obesity,” Knight says.
Knight and his colleagues have also studied connections between microbiota and diet, as well as the microbiomes of human households. Sampling feces from 33 mammal species and 18 humans who kept tabs on what they ate, the lab showed that humans’ and other animals’ microbiota adapt to diet similarly, differing broadly by function between plant and meat eaters. Sampling the homes of seven families, including three that moved house during the study, the lab, along with Jack Gilbert, now at the University of California, San Diego, found that each family home had a distinct microbiota and that the composition of the home microbiome was largely dictated by the humans who lived there. Even when people moved, the microbiota of their new home quickly began to look like that of their prior residence.
“We’ve gone from analyzing single sample data from people who are either healthy or sick to spatially resolved studies in which we can track people over time, looking at hundreds of samples,” Knight says.
In 2015, he moved his lab to UC San Diego, and began to focus on more all-encompassing projects, such as the American Gut Project and the Earth Microbiome Project. In 2016, the American Gut Consortium published a study showing that migraine sufferers had a significantly higher abundance of genes associated with nitric oxide production from anaerobic bacteria in their mouths compared with non–migraine sufferers. Nitric oxide has been associated with headaches. And with the Earth Microbiome Project Consortium, Knight and colleagues published a large meta-analysis of almost 28,000 samples and identified about 300,000 unique microbial 16S rRNA sequences from around the world.
As much as Knight enjoys asking biology questions, he says he also appreciates the time he spends working with mathematical models and computer programming, and loves teaching all of these skills. “It’s satisfying to write code that gives reproducible results and to create software in a way that biologists and not just computer scientists can understand it,” he says. “I’m grateful that Mike [Yarus] encouraged me to train and supervise students so that I was always running a computational lab.”
Even though Knight has had very little formal training in computer science, he now runs a mostly computational lab. And the irony is not lost on him. “The only computer science class I ever took was my first year in college. And now I am a professor in pediatrics, computer science, and bioengineering,” he says.
But teaching and research aren’t Knight’s only focuses. He says he also wants to engage the public in his science. He has written two general-interest books about the human microbiota. One, Follow Your Gut, is a survey of the knowledge researchers have gathered on how antibiotics, diet, living environment, and other factors affect our microbiome and our health. The other, Dirt is Good, coauthored with Jack Gilbert, provides advice for parents rooted in microbiome research. Knight says the books have spurred readers to ask many questions, and the one he most frequently gets is whether vaccines affect children’s microbiomes. “My answer is that there is overwhelming evidence that vaccines are good for keeping your kid alive,” he says. “Antibiotics, on the other hand, will have a major effect on their microbes.”