eavesdropping | theDiagonal

Good vibrations

Despite the differences in the technology involved, both approaches rely on the same principle: sound travels on waves of higher and lower pressure in the air. When these waves reach a flexible object, they set off small vibrations in the object. If you can detect these vibrations, it’s possible to reconstruct the sound. Laser-based systems detect the vibrations by watching for changes in the reflections of the laser light, but researchers wondered whether you could simply observe the object directly, using the ambient light it reflects. (The team involved researchers at MIT, Adobe Research, and Microsoft Research.)

The research team started with a simple test system made from a loudspeaker playing a rising tone, a high-speed camera, and a variety of objects: water, cardboard, a candy wrapper, some metallic foil, and (as a control) a brick. Each of these (even the brick) showed some response at the lowest end of the tonal range, but the other objects, particularly the cardboard and foil, had a response into much higher tonal regions. To observe the changes in ambient light, the camera didn’t have to capture the object at high resolution—it was used at 700 x 700 pixels or less—but it did have to be high-speed, capturing as many as 20,000 frames a second.

Processing the images wasn’t simple, however. A computer had to perform a weighted average over all the pixels captured, and even a twin 3.5GHz machine with 32GB of RAM took more than two hours to process one capture. Nevertheless, the results were impressive, as the algorithm was able to detect motion on the order of a thousandth of a pixel. This enabled the system to recreate the audio waves emitted by the loudspeaker.

Most of the rest of the paper describing the results involved making things harder on the system, as the researchers shifted to using human voices and moving the camera outside the room. They also showed that pre-testing the vibrating object’s response to a tone scale could help them improve their processing.

But perhaps the biggest surprise came when they showed that they didn’t actually need a specialized, high-speed camera. It turns out that most consumer-grade equipment doesn’t expose its entire sensor at once and instead scans an image across the sensor grid in a line-by-line fashion. Using a consumer video camera, the researchers were able to determine that there’s a 16 microsecond delay between each line, with a five millisecond delay between frames. Using this information, they treated each line as a separate exposure and were able to reproduce sound that way.

Read the entire article here.

Image courtesy of Google Search.

Many of your mobile devices already know where you are and what you’re doing. Increasingly the devices you use will record your every step and every word (and those of any callers), and even know your mood and health status. Analysts and eavesdroppers at the U.S. National Security Agency (NSA) must be licking their collective their lips.

From Technology Review:

The Moto X, the new smartphone from Google’s Motorola Mobility, might be remembered best someday for helping to usher in the era of ubiquitous listening.

Unlike earlier phones, the Moto X includes two low-power chips whose only function is to process data from a microphone and other sensors—without tapping the main processor and draining the battery. This is a big endorsement of the idea that phones could serve you better if they did more to figure out what is going on (see “Motorola Reveals First Google-Era Phone”). For instance, you might say “OK Google Now” to activate Google’s intelligent assistant software, rather than having to first tap the screen or press buttons to get an audio-processing function up and running.

This brings us closer to having phones that continually monitor their auditory environment to detect the phone owner’s voice, discern what room or other setting the phone is in, or pick up other clues from background noise. Such capacities make it possible for software to detect your moods, know when you are talking and not to disturb you, and perhaps someday keep a running record of everything you hear.

“Devices of the future will be increasingly aware of the user’s current context, goals, and needs, will become proactive—taking initiative to present relevant information,” says Pattie Maes, a professor at MIT’s Media Lab. “Their use will become more integrated in our daily behaviors, becoming almost an extension of ourselves. The Moto X is definitely a step in that direction.”

Even before the Moto X, there were apps, such as the Shazam music-identification service, that could continually listen for a signal. When users enable a new feature called “auto-tagging” on a recent update to Shazam’s iPad app, Shazam listens to everything in the background, all the time. It’s seeking matches for songs and TV content that the company has stored on its servers, so you can go back and find information about something that you might have heard a few minutes ago. But the key change is that Shazam can now listen all the time, not just when you tap a button to ask it to identify something. The update is planned for other platforms, too.

But other potential uses abound. Tanzeem Choudury, a researcher at Cornell University, has demonstrated software that can detect whether you are talking faster than normal, or other changes in pitch or frequency that suggest stress. The StressSense app she is developing aims to do things like pinpoint the sources of your stress—is it the 9:30 a.m. meeting, or a call from Uncle Hank?

Similarly, audio analysis could allow the phone to understand where it is—and make fewer mistakes, says Vlad Sejnoha, the chief technology officer of Nuance Communications, which develops voice-recognition technologies. “I’m sure you’ve been in situation where someone has a smartphone in their pocket and suddenly a little voice emerges from the pocket, asking how they can be helped,” he says. That’s caused when an assistance app like Apple’s Siri is accidentally triggered. If the phone’s always-on ears could accurately detect the muffled acoustical properties of a pocket or purse, it could eliminate this false start and stop phones from accidentally dialing numbers as well. “That’s a work in progress,” Sejnoha says. “And while it’s amusing, I think the general principle is serious: these devices have to try to understand the users’ world as much as possible.”

A phone might use ambient noise levels to decide how loud a ringtone should be: louder if you are out on the street, quiet if inside, says Chris Schmandt, director of the speech and mobility group at MIT’s Media Lab. Taking that concept a step further, a phone could detect an ambient conversation and recognize that one of the speakers was its owner. Then it might mute a potentially disruptive ringtone unless the call was from an important person, such as a spouse, Schmandt added.

Read the entire article here.

theDiagonal

Tag Archives: eavesdropping

Privacy and Potato Chips

Good vibrations

Listening versus Snooping