Movie special effects artists have learned how to put people into photos and videos, even if they were never there, and make animals talk. They can insert dead people into advertisements, such as the recent TV ads featuring Fred Astaire and John Wayne. Gareth Cook of the Boston Globe writes that now scientists at the Massachusetts Institute of Technology have created the first realistic videos of people saying things they never said.
The researchers taped a woman speaking into a camera, then reprocessed the footage into a new video that showed her speaking entirely new sentences, and even mouthing the words to a song in Japanese, a language she doesn't speak. The results were good enough to fool viewers.
The technique’s inventors say it could be used in video games and movie special effects. Directors could reanimate Marilyn Monroe or other dead film stars and have them say new lines. It could also be used to improve dubbed movies.
But scientists warn the technology will also provide a powerful new tool for fraud, and will cast doubt on everything from video surveillance to presidential addresses. "This is really groundbreaking work," says Demetri Terzopoulos, a leading specialist in facial animation who is a professor of computer science and mathematics at New York University. But, he says, "we are on a collision course with ethics. If you can make people say things they didn't say, then potentially all hell breaks loose."
"There is a certain point at which you raise the level of distrust to where it is hard to communicate through the medium," says Kathleen Hall Jamieson, dean of the Annenberg School for Communication at the University of Pennsylvania. "There are people who still believe the moon landing was staged."
Currently, the MIT method is limited: It works only on video of a person facing the camera and not moving much, like a newscaster. The technique generates only new video, not new audio. But it won't be long before the researchers learn how to handle a moving head at any angle, according to Tomaso Poggio, a neuroscientist at the McGovern Institute for Brain Research, who is on the MIT team and runs the lab where the work is being done.
”It is only a matter of time before somebody can get enough good video of your face to have it do what they like,” says Matthew Brand, a research scientist at MERL, a Cambridge-based laboratory for Mitsubishi Electric.
Previous work has focused on creating a virtual model of a person's mouth, then using a computer to render digital images of it as it moves. The new software instead applies artificial intelligence to teach a machine what a person looks like when talking. Between two and four minutes of video of the person talking is needed for the effect to work; from this footage the computer captures images that represent the full range of motion of the mouth and surrounding areas. The computer can then express any mouth image as a combination of these, the same way that any color can be represented by a combination of red, green, and blue. It goes through the video, learning how the person's face expresses each sound and how it moves from one sound to the next. Given a new sound, the computer can then generate an accurate picture of the mouth area and virtually superimpose it on the person's face.
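The color analogy above is essentially a linear-basis decomposition: a new mouth image is approximated as a weighted mix of a few captured "prototype" images. The sketch below illustrates that idea with a least-squares fit in NumPy; the array sizes, prototype count, and variable names are illustrative assumptions, not the MIT system's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend each 8x8 grayscale mouth image is flattened to a 64-vector.
n_pixels = 64
n_prototypes = 5

# Prototype mouth images captured from training video, stacked as columns,
# analogous to the red/green/blue primaries in the color analogy.
prototypes = rng.random((n_pixels, n_prototypes))

# A new mouth image to approximate (built here from known weights
# so the fit has an exact answer).
true_weights = np.array([0.2, 0.1, 0.4, 0.2, 0.1])
target = prototypes @ true_weights

# Solve for the mixing weights by least squares.
weights, *_ = np.linalg.lstsq(prototypes, target, rcond=None)

# Reconstruct the image from the prototypes and recovered weights.
reconstruction = prototypes @ weights
print(np.allclose(reconstruction, target))  # True
```

In the real system the prototypes come from video frames rather than random data, and the weights vary over time to animate the mouth through a sequence of sounds, but the core operation is this kind of linear combination.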
Right now, this method only seems lifelike for a sentence or two at a time, because over longer stretches, the speaker seems to lack emotion. Tony Ezzat, the graduate student who heads the MIT team, says he wants to develop a more complex model that would teach the computer to simulate basic emotions.
A specialist can still detect the video forgeries, but as the technology improves, scientists predict that video authentication will become a growing field, just like the authentication of photographs. "We will probably have to revert to a method common in the Middle Ages, which is eyewitness testimony," says Jamieson. "And there is probably something healthy in that."