Researchers Decode How Protein Language Models Think, Making AI More Transparent

By spreading out tightly packed information in neural networks, a new set of tools could make AI protein models easier to understand.

Written by Andrea Lius, PhD | 5 min read
A cartoon blue man, representing sparse autoencoders and the AI tools that work with them, untangles black jumbled thread, representing the convoluted and densely packed information in the neural networks of protein language models (PLMs).

For large language models (LLMs) like ChatGPT, accuracy often means complexity. To be able to make good predictions, ChatGPT must deeply understand the concepts and features that are associated with each word—but how it gets to this point is typically a black box.

Similarly, protein language models (PLMs), which are LLMs used by protein scientists, are dense with information. Scientists often have a hard time understanding how these models solve problems, and as a result, they struggle to judge the reliability of the models’ predictions.

Bonnie Berger is a mathematician and computer scientist at the Massachusetts Institute of Technology. She’s interested in using large language models to study proteins.

“These models give you an answer, but we have no idea why they give you that answer,” said Bonnie Berger, a mathematician and computer scientist at the Massachusetts Institute of Technology. Because it’s difficult to assess the models’ performance, “people either put zero trust or all their trust in these protein language models,” Berger said. She believes that one way to calm these qualms is to try to understand how PLMs think.

Recently, Berger’s team applied a tool called sparse autoencoders, which are often used to make LLMs more interpretable, to PLMs.1 By making the dense information within PLMs sparser, the researchers could uncover information about a protein’s family and its functions from a single sequence of amino acids. This work, published in the Proceedings of the National Academy of Sciences, may help scientists better understand how PLMs come to certain conclusions and increase researchers’ trust in them.

James Fraser is a biophysicist at the University of California, San Francisco who uses computational approaches to study protein conformation. He was not involved in the study.

“[This study] tells us a lot about what the models are picking up on,” said James Fraser, a biophysicist at the University of California, San Francisco who was not involved in the study. “It’s certainly cool to get this kind of look under the hood of what was previously kind of a black box.”


Berger thought that part of people’s excitement about PLMs had come from AlphaFold’s success. But while both PLMs and AlphaFold are AI tools, they work quite differently. AlphaFold predicts protein structure by aligning many related protein sequences. Models like these typically boast a high level of accuracy, but researchers must spend considerable time and resources to train them.

On the other hand, PLMs are designed to predict features of a protein, like how it interacts with other proteins, from a single sequence. PLMs learn the relationship between protein sequence and function instead of the relationship between different protein sequences. While they learn much faster, they may not be as accurate.

“When large language models that only take a single sequence came along, people thought, ‘We should believe this too,’” Berger said. “But now, they’re at the stage of, ‘Oh my gosh, they’re not always right.’” To know when PLMs are right or wrong, researchers first need to understand them.

PLMs are highly complex. Each neuron in the neural network—AI’s equivalent of a brain—handles more than one discrete unit of information, called a token, and each token is in turn processed by multiple neurons.

Onkar Gujral is a fifth-year mathematics PhD student at the Massachusetts Institute of Technology, advised by Bonnie Berger. He was the lead author of the study.

“You store information in clusters of neurons, so the information is very tightly compressed,” said Onkar Gujral, a graduate student in Berger’s group who led the study. “Think of it as entangled information, and we need to find a way to disentangle this information.”

This is where the sparse autoencoders come in. They allow information stored in the neural network to spread out among more neurons. With less tightly packed information, researchers can more easily figure out which neuron in the network associates with which feature of a protein, much like how neuroscientists try to assign specific functions to brain regions.
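To make the idea concrete, here is a minimal sketch of a sparse autoencoder applied to PLM activations, assuming PyTorch; the layer widths, ReLU activation, and L1 sparsity penalty are illustrative choices, not details reported in the study.

```python
# Minimal sparse autoencoder (SAE) sketch for protein language model (PLM)
# activations. Assumes PyTorch; all dimensions and weights are illustrative.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=1280, d_features=16384):
        super().__init__()
        # Expand each dense activation vector into a much wider feature space.
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations):
        # ReLU leaves only a handful of features strongly active per residue.
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        return features, reconstruction

def sae_loss(activations, features, reconstruction, l1_weight=1e-3):
    # Reconstruction error keeps the sparse code faithful to the original
    # activation; the L1 term pushes most feature values toward zero.
    mse = torch.mean((reconstruction - activations) ** 2)
    sparsity = torch.mean(torch.abs(features))
    return mse + l1_weight * sparsity

# Example with random stand-ins for per-residue PLM activations.
activations = torch.randn(32, 1280)
sae = SparseAutoencoder()
features, reconstruction = sae(activations)
loss = sae_loss(activations, features, reconstruction)
```

After training, each of the many sparse features tends to fire for a narrower, more recognizable property than any single dense neuron does, which is what makes the network easier to inspect.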

Next, the team fed the processed information to Claude, an LLM, which added annotations such as the protein’s name, family, and related pathways. “By disentangling the information, we can now interpret what’s going on inside the protein language model,” Gujral said.
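As a rough illustration of that annotation step, one could show an LLM the proteins that most strongly activate a given sparse feature and ask what they have in common. The sketch below assumes the anthropic Python SDK; the model name, prompt wording, and example proteins are placeholders rather than details from the paper.

```python
# Hypothetical sketch: ask an LLM to describe what one sparse feature detects.
# Assumes the anthropic Python SDK and an ANTHROPIC_API_KEY in the environment;
# the model name, prompt, and example proteins are illustrative placeholders.
import anthropic

top_proteins = ["hemoglobin subunit alpha", "myoglobin", "cytoglobin"]

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model name
    max_tokens=150,
    messages=[{
        "role": "user",
        "content": (
            "These proteins most strongly activate one sparse feature of a "
            "protein language model:\n- " + "\n- ".join(top_proteins) +
            "\nIn one sentence, say what they share (family, function, pathway)."
        ),
    }],
)
print(message.content[0].text)
```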

Fraser said, “This paper is among the first in a group of similar papers that came out roughly around the same time,” citing several preprint publications by other groups of researchers that also used sparse autoencoders to better understand PLMs.2-4

But Berger’s team didn’t think that disentangling information was enough. They also wanted to follow the models' train of thought. To do this, the researchers used transcoders, a variant of sparse autoencoders that tracks how information changes from one “layer” of the neural network to another. “It might give you the model’s logic of thinking—its change of thoughts—which can give you more confidence in its output,” Berger said.
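A minimal sketch of that idea, again assuming PyTorch: instead of reconstructing the same activations, a transcoder is trained to map the activations entering one layer to the activations that layer produces, through the same kind of wide, sparse bottleneck. Dimensions and training details here are illustrative, not taken from the study.

```python
# Minimal transcoder sketch, assuming PyTorch. It learns a sparse mapping from
# a layer's input activations to that layer's output activations, so the
# sparse features describe how information is transformed across the layer.
# All dimensions are illustrative placeholders.
import torch
import torch.nn as nn

class Transcoder(nn.Module):
    def __init__(self, d_in=1280, d_features=16384, d_out=1280):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_features)
        self.decoder = nn.Linear(d_features, d_out)

    def forward(self, layer_input):
        # Sparse code describing what this layer does to its input.
        features = torch.relu(self.encoder(layer_input))
        predicted_output = self.decoder(features)
        return features, predicted_output

# Training would minimize the gap between predicted_output and the layer's
# true output, plus an L1 sparsity penalty on `features`, as above.
layer_input = torch.randn(32, 1280)
transcoder = Transcoder()
features, predicted_output = transcoder(layer_input)
```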

Fraser thought that the quest to make PLMs more interpretable is a “really cool frontier,” but he still questions its practicality. “We’ve got AI interpreting AI. Then we need more AI to interpret that result—we're going down a rabbit hole,” he said. “It’s very, very hard to directly figure out what features the autoencoders are actually revealing.”

Berger, on the other hand, is confident that she’ll be able to put her tool to use. Her team previously developed a PLM to optimize antibody design for therapeutics and another to predict the interaction between drugs and their targets.5,6 She hopes to use sparse autoencoders and transcoders to better understand these models.


Meet the Author

Andrea Lius is an intern at The Scientist. She earned her PhD in pharmacology from the University of Washington. Besides science, she also enjoys writing short-form creative nonfiction.