On a sunny day near Perth, Australia, two-year-old Scarlett Whitmore stares intently at her left shoulder. With absolute concentration, she raises her head to look at her physical therapist, who is holding onto Scarlett’s arm to keep her steady. “I’m so proud of you,” her mother, Kate Whitmore, cheers as she films the session with her camera phone. Looking proud herself, Scarlett rolls onto her back, stretches out her arms and legs, and smiles broadly. Her smile is infectious. Her green eyes grow wide whenever she flashes her toothy grin, the inspiration for “Scarlett’s Smile,” the name of the foundation her parents started to raise money for Scarlett’s medical expenses.

Scarlett has poor hearing and vision and hasn’t learned to sit up on her own, stand, walk, or speak. And for the first year of her life, her parents had no idea why. Just after Scarlett was...

Months passed, and more tests came back negative. The Whitmores focused on early intervention therapies for Scarlett, trying to stay positive and enjoy spending time with their daughter. But whenever Scarlett cried, Kate agonized, not knowing if her daughter’s pain resulted from expected things, such as teething, or from her mysterious illness. “It was just eating me up not knowing what was wrong with her,” she says. While Scarlett slept, Kate researched her symptoms, which ranged from visual impairment to hypotonia (muscle weakness), trying desperately to figure out what was causing them in her child.

Finally, Kate came across information about an organization in Seattle, Washington, called MyGene2 that was offering to sequence and analyze the genomes of patients with undiagnosed diseases for about $700 per sample. Kate had heard of this type of sequencing before, but it was nearly impossible for the family to access it in Australia, and Scarlett’s geneticist had recommended against it: the approach is not routine in Australian hospitals, making it more expensive and the data more difficult to interpret. Nevertheless, just after Scarlett’s first birthday, the Whitmores sent saliva samples to MyGene2, where scientists sequenced each family member’s exome, the 1.5 percent of the genome that encodes proteins. Researchers at the University of Washington in Seattle then compared Scarlett’s exome sequence to databases containing thousands of sequences in search of a mutation that could explain her symptoms.

In January 2017, a verdict emerged: Scarlett had a rare mutation in the gene encoding G protein subunit beta 1 (GNB1), a component of a molecular switch protein complex known to regulate some neuronal functions. The Whitmores learned that Scarlett had not inherited the mutation from them, and that the disease will most likely spare her heart and lungs, giving them huge peace of mind. “This was worth its weight in gold,” says Kate. Along with just 30 other patients with GNB1 mutations worldwide, the Whitmores enrolled in a research study describing the mutation’s effects, and a paper reporting the findings is now being prepared for publication.

Just ten years ago, the Whitmores’ story would have been very different. Back then, sequencing and analyzing a single exome cost between $70,000 and $80,000 and took months to complete. These days, clinicians can easily order an exome sequence and analysis, and at a commercial cost of around $700-$5,000 the test has become widely available and is often covered by insurers. Organizations such as MyGene2 and larger, national organizations, such as the Centers for Mendelian Genomics (CMG) and the Undiagnosed Diseases Network (UDN), are using the approach to help diagnose rare diseases, and to end what clinicians call the “diagnostic odyssey” for hundreds of families every year. “Exome sequencing has really been revealing,” says Robert Kliegman, a neonatologist and rare disease specialist at Children’s Hospital of Wisconsin in Milwaukee.

Helpful as it’s been, however, exome sequencing only resolves 25 percent to 50 percent of undiagnosed cases. Researchers and clinicians are now exploring new tools, such as whole-genome sequencing and RNA analysis, developing better techniques to analyze sequence data, and finding ways to get patients with the same diseases connected faster. Together, these efforts make it likely that rare disease diagnosis will undergo another revolution in the next decade.

Exome explosion

About 15 years ago, Kliegman and his colleagues started noticing a huge unmet need at Children’s Hospital of Wisconsin. Families would end up there after years of searching for a diagnosis, and there was no system in place to settle their cases. Then chair of the pediatrics department at the Medical College of Wisconsin (MCW), Children’s Hospital’s academic partner, Kliegman began bringing together specialists to discuss undiagnosed cases in detail. But the team wasn’t galvanized until it came up against the case of Nic Volker, a young boy with severe inflammatory bowel disease. By the time Volker turned four, his intestines were dotted with holes, he’d had a colostomy, and he mainly ate through a feeding tube. The hospital’s gastrointestinal specialist couldn’t make sense of the disease, leaving Volker’s doctors with no options beyond treating his symptoms.

In 2009, at the request of Volker’s pediatrician, a team at MCW sequenced the boy’s exome. The $75,000 bill was covered by funds raised by Howard Jacob, the founding director of MCW’s genetics center, who hadn’t expected to implement exome sequencing there for at least another five years. Analysis of Volker’s genetic data picked up more than 16,000 gene variants, and four months of sifting through those variants revealed that a mutation in X-linked inhibitor of apoptosis protein (XIAP), a gene on the long arm of the X chromosome, was the likely culprit behind his illness.1 XIAP mutations were already associated with X-linked lymphoproliferative disease, an immunodeficiency disorder that leaves boys unable to fight off Epstein-Barr virus. Because the gene only affects immune cells, a cord blood transplant to replace Volker’s immune cell progenitors was enough to essentially cure him, says Kliegman. The case became nationally renowned as the first time DNA sequencing saved a patient’s life.

In a paper describing the research, the Wisconsin team noted that a thorough study of the available medical literature turned up a list of more than 2,000 gene variants that could have been responsible for Volker’s condition on the basis of his symptoms alone, and XIAP wasn’t on it. The boy’s case “was profound for all of the people in the hospital,” says Kliegman. “That was one of those eureka moments.” The experience led to a shift in the mindset of the hospital’s board, and now genetic sequencing is a cornerstone of the center’s diagnostic approach. By 2014, the MCW’s Human and Molecular Genetics Center (now the Genomic Sciences and Precision Medicine Center) was sequencing more than 700 patients per year.

Even as that project was getting started, other research teams were already putting together studies about how the approach could help diagnose rare genetic conditions on a larger scale. In 2009 and 2010, for example, a team led by geneticists at the University of Washington in Seattle demonstrated that exome sequence analysis alone could reveal disease-causing mutations, first in a group of people with a known disease2 and then in patients with undiagnosed diseases.3

Meanwhile, Duke University geneticists Vandana Shashi and David Goldstein were working to answer the practical question of how often exome analysis could be expected to provide a diagnosis. Goldstein recalls thinking that “if it resolves even just one out of ten of these really difficult cases, that’d be a remarkable new contribution.” The team enrolled 12 undiagnosed patients, all with different symptoms, into a pilot program at Duke, and identified disease-causing gene variants in the exomes of six. This success rate, combined with the steadily dropping costs of DNA sequencing, made it clear that exome sequencing would be a cost-effective way to end the frustration experienced by so many clinical geneticists and families searching for answers, Shashi says. “It’s energized people like me.”

Although in most cases a whole-exome sequencing diagnosis doesn’t lead to a cure as it did for Nic Volker, it usually opens a treatment path, says Kliegman. For example, many diseases historically known as “seizure disorders” now have names and mutations associated with them, allowing doctors to use targeted drugs “rather than shooting an ant with an elephant gun,” he says. In one case Kliegman worked on, the diagnosis delivered by exome sequencing made his patient eligible for deep brain stimulation. “That’s rewarding,” he says. “But we don’t find that in everyone.” He notes, however, that whatever the outcome, families are always glad to have some kind of answer.

Find the variant

A key factor in propelling exome sequencing into clinical diagnostics is the recent expansion of genome databases. In 2009, researchers working with Nic Volker’s sequence had to ask other scientists for access to sequences to compare against his. Now, accessing thousands of samples is simple thanks to efforts such as the 1000 Genomes Project, which ran from 2008 to 2015, and the Exome Aggregation Consortium (ExAC), launched in 2014 by researchers at the Broad Institute in Cambridge, Massachusetts. With 60,706 exome sequences deposited by more than 100 research projects, most of them run at the Broad, ExAC was the most comprehensive exome database at the time of its release. Its successor, the Genome Aggregation Database (gnomAD), already contains 123,136 exome sequences and 15,496 whole-genome sequences.

Such databases are vital because the diagnostic power of exome sequencing depends on clinicians’ ability to sift through variants and locate the pathogenic ones. “We all have thousands of rare variants, and most of them are completely benign,” says ExAC team member Anne O’Donnell-Luria, a geneticist and associate director of the Broad Institute’s Center for Mendelian Genomics (CMG), one of four centers funded by the US National Human Genome Research Institute (NHGRI) to pinpoint causative mutations for genetic diseases. ExAC and gnomAD contain data from individuals who are not known to be affected by severe pediatric disease, making them particularly handy for diagnosing children, such as Scarlett Whitmore, who have very rare genetic diseases. “We know what variants occur in the human population, and we can toss all those out,” Goldstein explains. “We can really narrow in on the candidates pretty quickly and effectively now.”
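The filtering step Goldstein describes can be sketched in a few lines of Python. This is a minimal illustration only: the gene names, variant labels, allele frequencies, and the `max_freq` cutoff are all made up for the example, and real pipelines work from full variant call files and far richer annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str  # protein-level description of the variant


# Hypothetical allele frequencies, standing in for what a reference
# database of unaffected individuals might report (invented numbers).
POPULATION_FREQ = {
    ("GENE_A", "p.Ala10Val"): 0.12,
    ("GENE_B", "p.Arg55Gln"): 0.0004,
}


def rare_candidates(patient_variants, population_freq, max_freq=0.0001):
    """Keep variants that are absent from, or very rare in, the reference set.

    The max_freq cutoff is an assumption chosen for illustration.
    """
    return [
        v for v in patient_variants
        if population_freq.get((v.gene, v.change), 0.0) <= max_freq
    ]


patient = [
    Variant("GENE_A", "p.Ala10Val"),  # common in the population: tossed out
    Variant("GENE_B", "p.Arg55Gln"),  # seen too often: tossed out
    Variant("GENE_C", "p.Ile80Asn"),  # never seen: stays a candidate
]
candidates = rare_candidates(patient, POPULATION_FREQ)
```

Only the variant never seen in the reference population survives the filter, which is why the size of the reference set matters so much: the more genomes it holds, the more benign variants can be ruled out.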

LOOKING FOR CLUES: Sequencing of Scarlett Whitmore’s exome identified a rare mutation that could explain her symptoms. (Photo: Kate Whitmore)

There are also several searchable online databases, such as the National Institutes of Health (NIH) archive ClinVar and the Wellcome Sanger Institute’s DECIPHER, which both contain variants and the phenotypes associated with them. Canada will soon have its own repository of rare disease variant data and clinical phenotypes called Genomics4RD.

If a patient’s variant appears in any of these databases, a diagnosis is on its way. Otherwise, researchers have to dig further to find out if a variant of unknown significance (VUS) is pathogenic. One approach is to try to predict how a variant impacts the function of the protein coded by the gene containing it. The recently developed Model Organism Aggregated Resources for Rare Variant Exploration (MARRVEL) database integrates information from other repositories with data from animal models of specific variants.5 And MCW’s Genomic Sciences and Precision Medicine Center (GSPMC) often performs molecular dynamics simulations to visualize how variants cause proteins to move differently in three dimensions.

Statistical analyses can also help determine pathogenicity. ExAC’s size has made it possible for researchers at the Broad Institute to develop constraint scores for genes by comparing how often one type of variant is expected to appear in a gene with how often it actually shows up. For example, if the algorithm predicts that loss-of-function variants in a certain gene should appear 20 times in a population of 60,000, but no such variants are ever observed, the gene is assigned a high score. The higher the score, the more likely it is that a loss-of-function variant in that gene causes disease.
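The observed-versus-expected comparison can be illustrated with a short Python sketch. It is a deliberately simplified stand-in, not the statistical model the Broad researchers use: the function name `depletion_score` and the 0-to-1 scale are assumptions made for this example.

```python
def depletion_score(observed: int, expected: float) -> float:
    """Return a 0-to-1 score; higher means fewer variants seen than expected.

    Simplified illustration only: this is NOT the model behind ExAC's
    constraint scores, just the observed-versus-expected idea.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    if observed < 0:
        raise ValueError("observed count cannot be negative")
    # A ratio of 0 means complete depletion; 1 or more means no depletion.
    ratio = observed / expected
    return max(0.0, 1.0 - ratio)


# The article's example: 20 loss-of-function variants expected, none observed.
strongly_constrained = depletion_score(observed=0, expected=20)

# A hypothetical tolerant gene: as many variants observed as expected.
tolerant = depletion_score(observed=20, expected=20)
```

In this toy version the gene from the article's example gets the maximum score, while a gene that carries as many loss-of-function variants as predicted scores zero.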

Researchers and clinical labs use multiple different tools to get detailed information about variants of interest, and more are being developed all the time (see “The Genetic Components of Rare Diseases,” The Scientist, July 2016). “Everyone tries everything they can to solve cases,” says O’Donnell-Luria. Even then, though, an analysis may fail to return a verdict. The report may come back with no candidates, or with one or more VUS. “There’s a fraction of cases where you have to work harder to determine whether you have a diagnosis,” says Goldstein. O’Donnell-Luria says that when one of her own patients gets back a negative clinical exome sequence report, she will offer to enroll him in a study and reanalyze his data using newly developed programs. She may also encourage families to apply for free exome sequencing and analysis by the CMG.

Occasionally, a case may call for whole-genome sequencing, which can reveal pathogenic mutations in noncoding regions of the genome, such as those that affect transcription. Beyond that, RNA sequencing may be performed to search for things like splice variants. MCW’s GSPMC is also optimizing methods to analyze DNA methylation patterns and may even go so far as to reproduce a VUS in a zebrafish or mouse model to determine its effect.

That said, sometimes getting an answer simply requires rechecking variant databases over time. “You may find that today it’s a VUS, and tomorrow someone reports another child, and then bingo,” says Kliegman. Scarlett Whitmore’s variant, for example, was caught during a second analysis because it took time for data from a recent study by Goldstein and others that described 13 other cases to make it into the variant databases.6 The two analyses were done only a few weeks apart. What makes all the difference in solving such cases is not just the available technology, Kliegman says, but a thorough and team-based approach. For each case, his team collects all of the patient’s primary medical test results and reanalyzes every detail. “We don’t ignore anything,” he says.

In 2008, the NIH initiated the Undiagnosed Diseases Program (UDP) to employ this kind of thorough workup to make diagnoses and improve research on rare diseases. In six years, the UDP received more than 10,000 inquiries.7 In 2014, the program expanded into the UDN, which includes seven academic medical centers across the U.S. that are funded to pay for patients’ travel expenses, run tests (including exome and RNA sequencing), and gather a diverse panel of specialists to work on cases. Bret Bostwick, a clinical geneticist at Baylor College of Medicine in Houston, one of the UDN sites, says that most families accepted into the UDN’s program have seen dozens of specialists over the years, but have never before had a team work in one concerted effort on their behalf. “Families really find a niche they’ve been looking for,” he says. Of the 685 patients evaluated since 2014, 177 have received a diagnosis.

Matching game

The UDN’s job doesn’t end at identifying a potentially pathogenic variant. “Just having a gene discovery doesn’t help anybody,” Bostwick says. “We also need to make sure that when we discover a new gene we take the time to gather patients who have the disease and study them so that they can teach [us] about what the gene does. That in turn teaches us how to treat them.”

When he and his colleagues determined that a recent patient of theirs carried a mutation in the cell cycle control gene CDK13, Bostwick searched for published case reports on the gene, and called up diagnostic laboratories to ask if they had sequenced others with the same mutation. But he made more headway by entering the patient’s information into a network called Matchmaker Exchange, which links several variant-phenotype databases and networks to help researchers and clinicians find multiple individuals with the same variant. In two months, the network connected Bostwick to eight other patients with the same CDK13 variant. “I could spend a lifetime, and I would never have found another patient by myself who has this gene change,” he says. Thanks to this accessible cohort, Bostwick and his colleagues were able to publish an updated description of the disease, which could help with future diagnoses.8

This process, however, is not always so expedient, and many patients wait years for a diagnosis because the clinical literature or variant databases haven’t yet caught up with their disease. “One of the major barriers right now to new gene discovery, and to how to use that information clinically, is data sharing,” says Michael Bamshad, a clinical geneticist at the University of Washington in Seattle who helped lead the institution’s early exome sequencing work and co-runs its CMG. Soon after the CMG launched in 2011, Bamshad says, it became clear that researchers were identifying gene variants faster than they could publish on them. “In essence, we were sitting on hundreds of discoveries, and any one of those discoveries could be useful for a family,” he says. The CMG initiated a database called GeneMatcher to link researchers and clinicians interested in the same genes, but Bamshad says this didn’t ease his frustration. “If two researchers shared data and made a match, neither were under the obligation to contact one another, much less to put together a manuscript to publicize the discovery,” he says.

For Bamshad and CMG colleague Jessica Chong, this frustration came to a head when they were contacted by a couple who had created a website and Facebook page to connect with families whose children had the same VUS as their son. “Gene discovery by social networking,” Bamshad called it. After connecting to another family, the couple needed help to get access to the family’s data, and Bamshad and Chong ended up coauthoring a report on the gene with the couple. The experience made the researchers wonder if something could be done to help families share their data on their own. “That was the birth of MyGene2,” says Bamshad.

MyGene2, which currently holds over 1,200 profiles, allows families to upload as much or as little information about their family member’s undiagnosed disease as they like. That may include medical records, annotated gene information from clinical exome sequencing reports, or actual exome or genome sequences. Doctors and researchers may also upload de-identified data from patients or participants in research studies. The platform automatically matches and notifies individuals who report the same variants, allowing families to utilize data that might otherwise stagnate. “This data is much more valuable if shared publicly,” says Chong. The system not only aids diagnosis, it also helps families learn about the details and prognosis of their family member’s condition.
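The matching idea is simple enough to sketch in Python. The following is a hypothetical illustration, not MyGene2's actual implementation: the profile structure, family identifiers, and variant labels are invented, and a real platform would also handle consent, notification, and variant normalization.

```python
from collections import defaultdict


def find_matches(profiles):
    """Group families by shared variant; keep variants reported more than once.

    `profiles` maps a family identifier to the set of (gene, variant)
    pairs that family has chosen to share (hypothetical structure).
    """
    by_variant = defaultdict(set)
    for family_id, variants in profiles.items():
        for variant in variants:
            by_variant[variant].add(family_id)
    return {
        variant: sorted(families)
        for variant, families in by_variant.items()
        if len(families) > 1
    }


# Invented example profiles.
profiles = {
    "family_1": {("GENE_X", "c.100A>G")},
    "family_2": {("GENE_X", "c.100A>G"), ("GENE_Y", "c.7C>T")},
    "family_3": {("GENE_Z", "c.42del")},
}
matches = find_matches(profiles)
```

Here only the first two families are matched, because they are the only ones reporting the same variant; every family in a matched group could then be notified.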

That couldn’t be truer for Kate Whitmore. Being in touch with other families whose children also have GNB1 mutations has helped the Whitmores learn how best to manage Scarlett’s disorder, what sorts of symptoms to expect, and which therapies to try. “I don’t worry so much, I don’t second-guess everything,” says Kate. These days, the Whitmores are freer to simply enjoy being with their daughter. “Scarlett’s a beautiful, happy little girl. She’s worth all the effort and then a hundred times more.”

Amanda B. Keener is a freelance science journalist living in Denver, CO.

How rare is rare enough?

Although they are uncommon by definition, rare diseases affect around 350 million people worldwide in total. “The magnitude is much bigger than what is perceived,” says Duke University clinical geneticist Vandana Shashi.

Nonprofit organizations such as Global Genes and the National Organization for Rare Disorders often report that there are about 7,000 known rare diseases. But the precise definition of “rare” may vary depending on whom you ask. In the European Union (E.U.), a disease is “rare” if it affects fewer than 5 in 10,000 people. The World Health Organization has defined rare diseases as those affecting “less than 6.5–10 people in 10,000.” Meanwhile China’s official definition, which remains controversial, is a disease affecting one person in 500,000 or one newborn in 10,000.9 (See “Rare Disease: By the Numbers.”)

One recent survey performed by researchers at the Canadian Agency for Drugs and Technologies in Health in Ontario found that, across 1,109 organizations worldwide, there were 296 different definitions for rare diseases and thresholds for orphan drugs, ranging from 5 cases to 76 cases per 100,000.10 The average definition was 40 cases per 100,000 people, close to the E.U.’s definition.
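Because these thresholds are quoted against different denominators, it helps to put them on a common scale. The short Python sketch below converts the figures cited in this sidebar to cases per 100,000 people; the helper name `per_100k` is an assumption made for the example.

```python
def per_100k(cases: float, population: float) -> float:
    """Convert 'cases per population' into cases per 100,000 people."""
    return cases * 100_000 / population


# Thresholds as quoted in this sidebar, on one common scale.
thresholds = {
    "E.U.": per_100k(5, 10_000),                 # 50.0
    "WHO (low end)": per_100k(6.5, 10_000),      # 65.0
    "WHO (high end)": per_100k(10, 10_000),      # 100.0
    "China (general)": per_100k(1, 500_000),     # 0.2
    "China (newborns)": per_100k(1, 10_000),     # 10.0
}
survey_average = 40.0  # cases per 100,000, as reported by the survey
```

On this scale the E.U. cutoff works out to 50 cases per 100,000, which is why the survey's average of 40 sits close to it, while China's general-population threshold of 0.2 per 100,000 is far stricter.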

These definitions matter to patients and their families waiting for drugs to be developed to treat rare diseases. In the U.S., for example, the Orphan Drug Act grants pharmaceutical companies various incentives, including tax cuts, for developing drugs meant to treat diseases that affect fewer than 200,000 Americans at any given time. The US Food and Drug Administration’s Office of Orphan Products Development funds several grants for companies working on drugs or medical devices that can benefit patients with rare diseases. The European Medicines Agency offers similar incentives, such as reduced fees and market exclusivity for drugs developed for diseases that meet the E.U.’s definition of rare.


  1. E.A. Worthey et al., “Making a definitive diagnosis: Successful clinical application of whole exome sequencing in a child with intractable inflammatory bowel disease,” Genetics in Medicine, 13:255-62, 2011.
  2. S.B. Ng et al., “Targeted capture and massively parallel sequencing of 12 human exomes,” Nature, 461:272-76, 2009.
  3. S.B. Ng et al., “Exome sequencing identifies the cause of a mendelian disorder,” Nat Genetics, 42:30-35, 2010.
  4. A.C. Need et al., “Clinical application of exome sequencing in undiagnosed genetic conditions,” J Medical Genetics, 49:353-61, 2012.
  5. J. Wang et al., “MARRVEL: Integration of human and model organism genetic resources to facilitate functional annotation of the human genome,” Am J Hum Genet, 100:843-53, 2017.
  6. S. Petrovski et al., “Germline de novo mutations in GNB1 cause severe neurodevelopmental disability, hypotonia, and seizures,” Am J Hum Genet, 98:1001-10, 2016.
  7. C.J. Tifft and D.R. Adams, “The National Institutes of Health undiagnosed diseases program,” Curr Opinion Pediatrics, 26:626-33, 2014.
  8. B.L. Bostwick et al., “Phenotypic and molecular characterisation of CDK13-related congenital heart defects, dysmorphic facial features and intellectual developmental disorders,” Genome Medicine, 9:73, 2017.
  9. Y. Cui and J. Han, “Defining rare diseases in China,” Intractable and Rare Diseases Research, 6:148-49, 2017.
  10. T. Richter et al., “Rare disease terminology and definitions—A systematic global review: Report of the ISPOR Rare Disease Special Interest Group,” Value in Health, 18:906-14, 2015.

Correction (May 9): The original version of this article incorrectly stated that Anne O’Donnell-Luria was a cofounder of ExAC, a database that contains data from individuals over 18 years old. The wording has been modified to reflect the fact that O’Donnell-Luria is only a part of the current ExAC team, and this database (along with its successor gnomAD) contains mostly, but not exclusively, data from over-18-year-olds. The Scientist regrets the error.

Correction (May 14): An earlier version of this article referred to “Washington University in Seattle”. The text has been updated with the institution's official name, to read “the University of Washington in Seattle.” The Scientist regrets the error.
