Computer Program Converts Brain Signals to a Synthetic Voice

A proof-of-principle study raises hopes that technology can give a voice to paralyzed people unable to speak.

Apr 24, 2019
David Adam

ABOVE: Coauthor Gopala Anumanchipalli holds the type of intracranial electrode array used in the study.
UCSF

A new computer program translates brain signals into language. The technology tracks the electrical messages passed to muscles in and around the mouth to decode what the brain is trying to say. Further tests are needed, but the developers say it could be used to design brain implants to help people who have suffered a stroke or brain disease communicate.

“We want to create technologies that can reproduce speech directly from human brain activity,” Edward Chang, a neurosurgeon at the University of California, San Francisco, who led the research, said during a press conference. “This study provides a proof of principle that this is possible.” He and his colleagues describe the results in Nature today (April 24).

The technique is highly invasive and relies on electrodes placed deep in the brain. As such, it has so far been tested on only five people with epilepsy, who had the electrodes fitted as part of their treatment. These people could speak during the tests, and did, which allowed the computer to work out the associated brain signals. The scientists must now check whether it works in people who cannot speak.

That will probably be more difficult, says Nick Ramsey, a neuroscientist at the University Medical Center Utrecht in the Netherlands, who works on brain implants to help people with locked-in syndrome communicate despite widespread paralysis of their muscles. “It’s still an open question whether you will be able to get enough brain data from people who can’t speak to build your decoder,” he says. Still, he calls the study “elegant and sophisticated” and says the results show promise. “I’ve followed their work for a couple of years and they really understand what they’re doing.”

Speech is one of the most complex motor actions in the human body. It requires precise neural control and coordination of muscles across the lips, tongue, jaw, and larynx. To decode this activity, the scientists used the implanted electrodes to track signals sent from the brain while the volunteers read a series of sentences aloud. A computer algorithm analyzed these signals using a pre-existing model of how the vocal tract moves to make sounds. A second processing stage then converted the predicted movements into spoken sentences.

This two-stage approach, translating brain activity into motor movements and then motor movements into words, produces less distortion than trying to convert brain signals directly to speech, Chang says. When the team played 101 synthesized sentences to listeners and asked them to identify the spoken words from a 25-word list, the listeners transcribed 43 percent of the sentences accurately.
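
For readers who think in code, the two-stage idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names, the weight matrices, and the use of simple linear maps are all hypothetical stand-ins, not the models or data used in the study.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply_linear(weights: Matrix, features: Vector) -> Vector:
    """Multiply a weight matrix by a feature vector (one output per row)."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def decode_frame(neural: Vector,
                 neural_to_movement: Matrix,
                 movement_to_sound: Matrix) -> Vector:
    """Decode one time frame in two stages.

    Both matrices are hypothetical placeholders for trained models.
    """
    # Stage 1: brain signals -> estimated vocal tract movements.
    movement = apply_linear(neural_to_movement, neural)
    # Stage 2: estimated movements -> acoustic features for a synthesizer.
    return apply_linear(movement_to_sound, movement)


# Toy example: 3 made-up electrode readings, 2 movement features, 2 sound features.
toy_neural = [1.0, 2.0, 4.0]
toy_stage1 = [[1, 0, 0], [0, 1, 1]]
toy_stage2 = [[2, 0], [0, 0.5]]
toy_sound = decode_frame(toy_neural, toy_stage1, toy_stage2)
```

The point of the sketch is the structure rather than the math: the intermediate `movement` vector is an explicit, physically meaningful step between brain activity and sound, which is the design choice Chang credits with reducing distortion.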

Qinwan Rabbani, a graduate student who works on similar systems at Johns Hopkins University, has listened to the synthesized sentences and says they’re good, especially as the computer only had a dozen or so minutes of speech to analyze. Algorithms that decode speech typically need “days or weeks” of audio recordings, he says.

Brain signals that control speech are more complicated to decode than those used to, say, move arms and legs, and more easily influenced by emotional state and tiredness. That means a synthetic speech system eventually applied to paralyzed patients would probably be restricted to a limited set of words, Rabbani says.

G.K. Anumanchipalli et al., “Speech synthesis from neural decoding of spoken sentences,” Nature, doi:10.1038/s41586-019-1119-1, 2019.