Visualizing the Vibe

Retrieving sound from video recordings of inanimate objects can have surprising applications.

Oct 1, 2014
Jyoti Madhusoodanan

SEEING SOUND: Researchers recovered human speech by analyzing high-speed video recordings of a bag of chips vibrating in response to speech from a cell phone in the same room. ABE DAVIS

Watch what you say. Nearly everything around you—from potted plants to a bag of chips—is catching your vibes. Sound waves pinging off the surfaces of these objects cause tiny vibrations invisible to the naked eye. But a group of MIT researchers led by William Freeman has devised a way to spot these movements on a high-speed video recording and use them to reconstruct the sound that triggered them. Their technique, presented at a meeting in August, effectively turns a variety of everyday objects into visual microphones.

Retrieving speech or song from footage of a shiny piece of foil may seem like the stuff of spy movies, but by magnifying minuscule movements, researchers could do some surprising, far-fetched things. “Our labs have been doing this work on amplifying and visualizing small motions in video for a while,” explains graduate student Abe Davis, who participated in the recent study.

Sounds cause ripples in air pressure that can make surrounding objects move. “Sound is just a motion that travels in the fluid of air,” says David Stoker of SRI International (until 1977 the Stanford Research Institute) in California, who was not involved with the study. “Being able to see vibrations is a fundamental tool in doing science.”

TINY MICS: Researchers were also able to use silent high-speed video recorded from a distance to identify music playing through a pair of earbuds simply by analyzing minuscule vibrations of the buds. ABE DAVIS

To identify the original sound that made objects vibrate, Davis and his colleagues analyzed the video recordings and calculated the amount of motion at every pixel, orientation, and scale of the image. The researchers aligned these signals to create a single, global picture of the object’s motion, and finally, filtered this vibration to recover audio. Although they tested the technique on things ranging from bricks to roses, teapots to crumpled-up foil, the method worked best on well-lit, thin surfaces that provided plenty of contrast.
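The idea of turning per-pixel motion into one global signal can be illustrated with a toy sketch. This is not the researchers' method, which measures motion across orientations and scales; it swaps in a much simpler first-order estimate that compares each frame's brightness change with the spatial gradient of a reference frame. Everything here is an assumption made for illustration: the synthetic one-row "video," the 2,000 frames-per-second rate, the 100 Hz tone, the 0.01-pixel vibration, and the function names.

```python
import numpy as np

def synth_frames(n_frames=400, fps=2000.0, tone_hz=100.0, amp_px=0.01, width=128):
    """Hypothetical test data: a 1-D textured 'surface' shifted by a tiny sinusoid."""
    t = np.arange(n_frames) / fps
    x = np.arange(width, dtype=float)
    shift = amp_px * np.sin(2 * np.pi * tone_hz * t)  # sub-pixel motion per frame

    def texture(pos):
        # Smooth pattern so that sub-pixel shifts are well defined.
        return np.sin(0.2 * pos) + 0.5 * np.sin(0.53 * pos + 1.0)

    frames = np.stack([texture(x - s) for s in shift])  # shape (n_frames, width)
    return frames, shift, fps

def recover_motion(frames):
    """Estimate one global shift per frame, relative to the first frame.

    Uses the first-order approximation I(x - s) ~ I(x) - s * I'(x), so the
    least-squares shift is s ~ -sum(diff * grad) / sum(grad ** 2).
    """
    ref = frames[0]
    grad = np.gradient(ref)   # spatial brightness gradient of the reference
    diff = frames - ref       # brightness change at every pixel, every frame
    return -(diff @ grad) / (grad @ grad)

def dominant_frequency(signal, fps):
    """Return the frequency (Hz) of the strongest component in the signal."""
    signal = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    return freqs[np.argmax(spectrum)]

frames, true_shift, fps = synth_frames()
estimate = recover_motion(frames)
print("correlation with true motion:", np.corrcoef(estimate, true_shift)[0, 1])
print("dominant frequency (Hz):", dominant_frequency(estimate, fps))
```

In this sketch, pixels with steep brightness gradients contribute most to the estimate, which is loosely in line with the observation that high-contrast surfaces worked best. Real footage would add noise, lighting changes, and two-dimensional motion that this toy example ignores.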

“I talked at a lot of inanimate objects. Chips, plates, cups, hair . . . [it was] actually pretty embarrassing,” says Davis. Refining these early results enabled the group to apply its techniques to less contrived situations. Eventually, the team reproduced the notes of “Mary Had a Little Lamb,” played on a simple speaker, from the barely noticeable movements of a potted plant in the same room.

In previous work, Freeman and his colleagues extracted heart-rate data from color changes in a person’s face caused by blood flow, finding that asymmetries in facial blood circulation may reveal deeper arterial malfunctions. The group also demonstrated how newborn babies’ vital signs could be detected remotely by filming and amplifying the movement of blood under their skin. These techniques offer the unique advantage of being completely passive, according to Davis. Whether monitoring ambient sound or a newborn’s heartbeat, all that’s required is a video recording of a leaf or a cheek (ACM Transactions on Graphics, doi: 10.1145/2185520.2185561, 2012).

The new work is “a neat paper,” says computer-vision researcher Jon Barron of Google[x], who was not involved with the study. “It’s really exciting [that they] discovered and solved a problem that no one knew existed.”

It’s less certain whether the visual-microphone method has the diagnostic potential possessed by Freeman’s earlier studies. “From a scientific perspective it’s brilliant,” says medical acoustics researcher Tyrone Porter of Boston University, who was not involved with this project. He adds, however, that it was hard to “think of where it would be used that’s more efficient than [other methods].”

Porter suggests the technique may help researchers studying physical processes in cultured cells, both microbial and human. Hints that physical vibrations—including sound—may serve as a means of intercellular microbial communication have emerged in recent years. The ideas stem from experiments that found sound waves might stimulate bacterial growth; one early study suggested that Bacillus subtilis produced reproducible sound vibrations. Others have suggested that electrical or electromagnetic currents may also play a role in single-cell communications. Pending confirmation, many of these hypotheses remain controversial (Trends Microbiol, 19:105-13, 2011).

“The notion of sound waves propagating between cells and that being a form of communication between cells is very unique and different,” says Porter. “This [filming technique] could have applications there because trying to capture sound waves with traditional pressure transducers would just be really difficult.”

Whether the visual-microphone method will allow researchers to eavesdrop on microbial chatter is still unknown. The technique is one that hadn’t been seriously considered before, according to Barron. “Now that we know [retrieving sound in this way] is possible, there’s a lot of excitement about what we can do in this space,” he says. The most interesting applications, Barron adds, are likely to be “in the ideas it spawns that aren’t necessarily obvious to us yet.”