Unstable Protein Variants Linked to Many Human Diseases

The Human Domainome 1—the largest library of human protein variants—reveals the cause of certain genetic disorders, paving the way for personalized medicines.

Sahana Sitaraman, PhD
An illustration showing diverse protein domains, represented as squiggly gray structures, on a black background.

Every human is born with 50 to 100 genetic variants that were not present in their parents.1 Based on this rate of mutation and the current size of the human population, researchers estimate that nearly all of the approximately nine billion possible single nucleotide changes that are compatible with life are already present in the world’s population. Genetic variants that change a protein’s amino acid sequence, producing protein variants, cause a third of all known human genetic diseases.2,3 Yet, scientists know very little about how most of these single nucleotide mutations affect protein function.

Now, researchers have created the Human Domainome 1 library, which contains more than half a million missense variants.4 Using this reference dataset, the largest of its kind, the team analyzed the impact of protein domain variants on protein stability. Their findings, published in Nature, could help scientists better understand how disease-causing missense variants change protein function and pave the way for personalized treatment of human genetic disorders.

Antoni Beltran, a molecular biologist at the Center for Genomic Regulation, aims to understand if it’s possible to predict the function of proteins from their sequence alone.

Centro de Regulación Genómica

“We measured every possible mutation in these [500 or so] protein [domains],” said Antoni Beltran, a molecular biologist at the Center for Genomic Regulation, and study coauthor. “So, the data is very rich and really good for prediction.”

Noting the value of the large dataset and its potential ripple effects on patient care, Frederick Roth, a molecular geneticist at the University of Toronto, who was not involved in the study, said, “This is getting ahead of the game. Rather than waiting till you see a variant in a sick person for the first time and doing the assay—often months or years later—you're doing them all in advance.”

Most human proteins are made of small, independently folded units called domains that impart diverse functions to the molecules. Since proteins are large, and therefore challenging to experimentally clone and study, Beltran and his colleagues focused on domains from 127 different families. To create the Human Domainome 1 library, the team methodically mutated each amino acid to all 19 alternatives at every position, generating more than 500,000 variants across 522 protein domains.
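The arithmetic behind that library size can be illustrated with a minimal sketch. It enumerates every single amino acid substitution for one sequence; the function name and the toy sequence are hypothetical and are not taken from the study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def saturation_variants(domain_seq):
    """Return every single-substitution variant as (position, wild_type, mutant)."""
    variants = []
    for pos, wt in enumerate(domain_seq, start=1):
        for aa in AMINO_ACIDS:
            if aa != wt:  # skip the wild-type residue, leaving 19 alternatives
                variants.append((pos, wt, aa))
    return variants


toy_domain = "MKTAYIAKQR"  # hypothetical 10-residue sequence, not from the study
variants = saturation_variants(toy_domain)
print(len(variants))  # 19 substitutions x 10 positions = 190
```

By the same arithmetic, more than 500,000 variants spread over 522 domains works out to an average of roughly 50 mutated positions per domain, which fits the team's focus on small domains rather than whole proteins.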

“The goal of testing all variants in the human genome [present] in functional regions [of proteins] is staggering,” Roth said. “But I think this paper makes the really important statement that the scale is achievable.”

Single nucleotide changes can affect different properties of a protein, such as its interactions with other cellular molecules, its abundance, or its functional repertoire. Since many disease-causing variants likely diminish protein stability, and consequently abundance, the researchers focused on quantifying these properties in their study. To perform this analysis for hundreds of thousands of domain variants, Beltran and the team relied on the genetic amenability of yeast. They attached each protein domain variant to an enzyme that is essential for yeast and quantified cell growth rate as a proxy for domain stability. By comparing the abundance of each variant before and after the cells underwent a period of growth, the team could identify which missense variants reduced protein stability. This strategy allowed them to pool and analyze hundreds of thousands of variants of diverse proteins in a single experiment. They observed that mutations buried deep within the domains are more destabilizing than those on the surface and that mutations to proline are the most detrimental.
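The before-and-after comparison can be sketched as a log-ratio enrichment score, a common way to summarize pooled growth experiments. This is an illustration under stated assumptions, not the authors' analysis pipeline: the function name, the pseudocount, the variant labels, and the read counts are all hypothetical.

```python
import math


def growth_scores(counts_before, counts_after, wild_type="WT", pseudocount=0.5):
    """Log-ratio enrichment of each variant, expressed relative to wild type.

    A score near 0 means the variant grew about as well as wild type;
    a strongly negative score suggests reduced abundance or stability.
    """
    def log_ratio(name):
        after = counts_after.get(name, 0) + pseudocount  # pseudocount avoids log(0)
        before = counts_before[name] + pseudocount
        return math.log(after / before)

    wt_ratio = log_ratio(wild_type)
    return {
        name: log_ratio(name) - wt_ratio
        for name in counts_before
        if name != wild_type
    }


# Hypothetical sequencing read counts before and after the growth period
before = {"WT": 1000, "A5P": 1000, "K2R": 1000}
after = {"WT": 4000, "A5P": 500, "K2R": 3900}
scores = growth_scores(before, after)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.2f}")
```

In this toy example the proline substitution scores far below wild type while the conservative lysine-to-arginine change scores close to zero, mirroring the pattern the article describes.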

Next, the team explored whether variants that destabilize proteins are more likely to cause diseases. Human Domainome 1 includes 621 known pathogenic variants, which increase the risk of developing certain disorders in the individuals who carry them. Beltran and his colleagues observed that the amino acid change destabilized the protein in 61 percent of pathogenic variants and 40 percent of benign variants. For some domain types, such as the sterile alpha motif and LIM domain 2, changes in stability strongly predicted whether a variant could cause disease.

“Even though these are different proteins that cause different diseases, the loss of stability is like a unifying mechanism,” Beltran said. “This is important because if we have a general way to re-stabilize them, it could be an important way forward to treat these diseases.” However, Roth pointed out that if so many harmless variants also destabilize the protein, the threshold the study authors set for meaningful destabilization might not be high enough. “The most interesting question is ‘what fraction of pathogenic variants are pathogenic because they lost stability?’” he said.

The researchers then homed in on domains that have clinically verified variants and quantified the relationship between protein stability and disease propensity. They found that protein stability plays a role in recessive disorders and in conditions such as cataracts and facial clefts, but does not contribute significantly to disorders like Rett syndrome and retinal dystrophy. These disorders are caused mainly by mutations in a protein’s DNA-binding sequence, which Beltran and his team found did not strongly destabilize the molecules.

Moving forward, Beltran plans to expand the study to include longer domains and possibly entire proteins, and to measure other properties such as protein-protein interactions and aggregation. “We want to be able to predict any molecular functions from protein sequences alone,” he said.

  1. Shirts BH, et al. Family-specific variants and the limits of human genetics. Trends Mol Med. 2016;22(11):925-934.
  2. Lek M, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285-291.
  3. Karbassi I, et al. A standardized DNA variant scoring system for pathogenicity assessments in Mendelian disorders. Hum Mutat. 2016;37(1):127-134.
  4. Beltran A, et al. Site-saturation mutagenesis of 500 human protein domains. Nature. 2025;637(8047):885-894.

Meet the Author

  • Sahana Sitaraman, PhD

    Sahana is a science journalist and an intern at The Scientist, with a background in neuroscience and microbiology. She has previously written for Live Science, Massive Science, and eLife.