MIT AI Looks at Still Images, Predicts What Will Happen Next

...

The human brain's capacity for imagination is boundless. But machine intelligence still struggles to form images, ideas, and sensations without direct input.

However, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), led by PhD student Carl Vondrick, have developed a deep-learning algorithm that creates short videos to simulate the future of a still image, like a beach with crashing waves or golfers walking along the grass.

"These videos show us what computers think can happen in a scene," Vondrick said in a statement. "If you can predict the future, you must have understood something about the present."

Vondrick worked with MIT professor Antonio Torralba and University of Maryland Baltimore County professor Hamed Pirsiavash on the project. They are not the first to tackle this topic, but their model does pioneer new techniques, such as generating every frame of a scene at once instead of building it up one frame at a time.

"Building up a scene frame-by-frame is like a big game of 'Telephone,' which means that the message falls apart by the time you go around the whole room," Vondrick said. "By instead trying to predict all frames simultaneously, it's as if you're talking to everyone in the room at once."

Using the "adversarial learning" method, the team trained two competing neural networks: one to generate video, the other to discriminate between what is real and what is fabricated. Over time, the generator learns to fool the discriminator, thus creating videos that resemble actual scenes from beaches, train stations, hospitals, and golf courses.
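To make the idea of two competing networks concrete, here is a minimal sketch of adversarial training on one-dimensional numbers rather than video. It is a toy under stated assumptions, not the CSAIL model: the function name `train_toy_gan`, the Gaussian "real" data, the one-parameter generator, and the logistic discriminator are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps: int = 2000, lr: float = 0.05, seed: int = 0):
    """Alternate discriminator and generator updates on 1-D data.

    Hypothetical toy setup (not the CSAIL video model):
      real samples   ~ Normal(4.0, 0.5)
      generator      g(z) = z + shift, with z ~ Normal(0, 0.5)
      discriminator  D(x) = sigmoid(w * x + b), the estimated P(x is real)
    """
    rng = random.Random(seed)
    shift = 0.0      # the generator's only parameter
    w, b = 0.0, 0.0  # the discriminator's parameters
    history = []

    for _ in range(steps):
        real = rng.gauss(4.0, 0.5)
        fake = rng.gauss(0.0, 0.5) + shift

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)).
        d_real = sigmoid(w * real + b)
        d_fake = sigmoid(w * fake + b)
        w += lr * ((1.0 - d_real) * real - d_fake * fake)
        b += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(fake),
        # i.e. try to fool the freshly updated discriminator.
        d_fake = sigmoid(w * fake + b)
        shift += lr * (1.0 - d_fake) * w

        history.append(shift)

    return shift, (w, b), history


if __name__ == "__main__":
    final_shift, (w, b), _ = train_toy_gan()
    print("generator shift:", final_shift)
    print("discriminator parameters:", w, b)
```

While the discriminator can still tell the two sources apart, its feedback pushes the generator's output toward the real data. Plain adversarial training like this can oscillate rather than settle, so the sketch makes no promise about the final values.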

When put to the test, the algorithm generated videos that human subjects deemed realistic 20 percent more often than a baseline model.

But don't expect to see AI-conceived blockbusters on the big screen any time soon. The model lacks some common sense, and it cannot yet handle anything longer than 1.5 seconds. Vondrick, however, has high hopes for the eventual production of longer-form videos.

"It's difficult to aggregate accurate information across long time periods in videos," he said. "If the video has both cooking and eating activities, you have to be able to link those two together to make sense of the scene."

According to MIT, this model could be used for adding animation to still images, detecting anomalies in security footage, and compressing data to store and send longer clips. "In the future, this will let us scale up vision systems to recognize objects and scenes without any supervision, simply by training them on video," Vondrick said.
