Watching TV can be a very educational experience for a computer.
In a paper that will be presented this week at the International Conference on Computer Vision and Pattern Recognition, researchers at MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) describe an algorithm that can predict how humans will behave in certain situations.
The algorithm ‘watched’ 600 hours of TV shows culled from clips posted on YouTube, including The Office, The Big Bang Theory, and Desperate Housewives. The purpose was to see if it could accurately predict what humans would do during an interaction: would they shake hands? High five? Hug? Kiss? After feeding it the background material, the researchers had the algorithm watch new clips, froze each clip just before an action was about to happen, and asked the algorithm to predict what came next. It got the answer right 43 percent of the time.
That’s worse than humans, who were able to correctly predict what would happen 71 percent of the time. But still, for a computer, that’s pretty good, and better than the 36 percent that other, similar experiments found.
Eventually, this could lead to artificial intelligence that is better able to react to humans, or even security cameras that could alert authorities when people are in need of help. (In an alternative, more dystopian scenario, one could imagine that computers able to predict human behavior might lead to an AI version of Minority Report.)
“I’m excited to see how much better the algorithms get if we can feed them a lifetime’s worth of videos,” says lead author Carl Vondrick. “We might see some significant improvements that would get us closer to using predictive-vision in real-world situations.”
Previous adventures at CSAIL have resulted in jumping, cube-shaped exploring robots, algorithms that predict how objects will behave, a 3D-printed robot with a liquid center, and, of course, a town of self-driving rubber ducks.