Researchers at Carnegie Mellon have created the most convincing "deepfakes" yet

Ever heard of "deepfakes"? The AI technique, which superimposes one person's face onto another's body, has been used to replace Harrison Ford with Nicolas Cage in countless video clips, as well as for more sinister purposes: celebrities have appeared in pornography and propaganda without their knowledge. Now, for better or worse, researchers at Carnegie Mellon University have developed a new, more powerful and versatile system.

It is called "Recycle-GAN": a system for transforming the content of one video or photo into the likeness of another that trains exclusively on unpaired input data (unsupervised learning). "The task of changing content while preserving the style of the original has many uses," the researchers say, "for example, transferring one person's movements and facial expressions onto another, teaching robots by imitation, or converting black-and-white video to color."

Until now, even the most advanced transformation methods have focused on human faces, and according to the researchers "are almost impossible to apply to other domains"; moreover, "they perform very poorly with partially occluded faces." Other methods rely on frame-by-frame transformation, which requires time-consuming manual labeling and alignment of data.


Recycle-GAN uses generative adversarial networks (GANs) and "spatio-temporal cues" to "link" two images or videos. (A GAN is a model consisting of a generator that tries to "fool" a discriminator by producing increasingly realistic outputs from input data.) When trained on video of people, the system produces footage that captures subtle details such as the dimples that form when someone smiles and the movement of their lips.
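The adversarial idea itself can be shown in a few lines. The following toy sketch (not the researchers' code, just an illustration of the generator-vs-discriminator game) trains a one-parameter "generator" to produce samples that a logistic-regression "discriminator" can no longer tell apart from real data drawn from a normal distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

w, b = 0.0, 0.0   # discriminator: logistic regression on scalar samples
theta = 0.0       # generator: shifts standard-normal noise by theta
lr = 0.05

for step in range(3000):
    # Discriminator update: push D(real) toward 1, D(fake) toward 0
    x = rng.normal(4.0, 0.5)          # a "real" sample, mean 4
    f = rng.normal(0.0, 1.0) + theta  # a "fake" sample from the generator
    d_real = sigmoid(w * x + b)
    d_fake = sigmoid(w * f + b)
    w += lr * ((1 - d_real) * x - d_fake * f)
    b += lr * ((1 - d_real) - d_fake)

    # Generator update: push D(fake) toward 1 (non-saturating GAN loss)
    f = rng.normal(0.0, 1.0) + theta
    d_fake = sigmoid(w * f + b)
    theta += lr * (1 - d_fake) * w

print(theta)  # settles near the real-data mean of 4
```

As the generator's output distribution drifts toward the real one, the discriminator's gradient shrinks and the two networks reach a rough equilibrium; Recycle-GAN adds temporal constraints on top of this basic game so that consecutive video frames stay consistent.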

“Without any intervention and initial knowledge related to the specifics of the video, our approach is able to learn simply using publicly available subject videos from the Internet,” the development team writes.

Recycle-GAN is capable of much more than transferring facial expressions. The researchers used it to change weather conditions in video, turning a perfectly calm day into a windy one. They also simulated flowers blooming and dying, and synthesized a convincing sunrise from clips found online.

The test results are promising: the system fooled 15 human subjects in 28.3% of cases, and the team believes future versions could be more convincing if they accounted for playback speed, for example how much faster or slower a person speaks in the video.

" A plausible transfer of style should be able to take into account even the time difference resulting from the reproduction of speech / content, ”the team wrote. “We believe that the best spatiotemporal neural network architecture can solve this problem in the near future.”

Unsurprisingly, deepfakes remain a hotly debated issue. Publicly available services make them relatively easy to create, and there is little legal basis for protecting the victims of such videos.

Reddit, Pornhub, Twitter and others have taken stands against them, and researchers (most recently joined by the US Department of Defense) continue to look for ways to detect deepfakes. But as Eric Goldman, a law professor at Santa Clara University and director of its High Tech Law Institute, recently put it, it's best to "get ready to live in a world where real and fake photos and videos will surround us everywhere."
