Deva Ramanan clicks a button on his MacBook Air and a video begins to play: Michelle Kwan skating in the 1998 Nagano Olympics. Next to it, a computer program renders what it "sees" in the footage: Kwan's head, legs, torso, upper arms, and forearms, all distinguished by different colors. Ramanan, a computer scientist at the University of California at Irvine, trains computers to recognize three-dimensional humans in flat photography.
Face-recognition software, which pinpoints the classic eyes-nose-mouth configuration, has been in use for years. But detecting a human body—any human body—is much more challenging for computers due to the endless variety of possible poses, angles, sizes, and outfits. Most researchers will feed a program millions of images to memorize, building a vast database of people. Ramanan, instead, trained his computer program to identify body parts and match them to a flexible human template. "You can think of it as a divide-and-conquer approach," he says. The software runs through a checklist: Arms, torso, legs? Check. Thus, a human. Ramanan's method is much faster and uses less processing power than the traditional one.
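To make the divide-and-conquer idea concrete, here is a minimal sketch in Python. It is an illustration of the general approach described above, not Ramanan's actual software: the part names, the template offsets, the stiffness weight, and the threshold are all assumptions invented for this example. Each candidate body part carries a score from a hypothetical detector, and the flexible template penalizes parts that sit far from where they are expected relative to the torso.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    x: float
    y: float
    score: float  # appearance score from a hypothetical part detector


# Assumed template: expected (dx, dy) offset of each part from the torso.
TEMPLATE = {"head": (0.0, -3.0), "arms": (0.0, -1.0), "legs": (0.0, 3.0)}
STIFFNESS = 0.1  # assumed weight: how strongly the template resists stretching


def best_part(torso, candidates, offset):
    """Best score for one part, judged independently of the other parts."""
    dx, dy = offset

    def fit(c):
        stretch = (c.x - torso.x - dx) ** 2 + (c.y - torso.y - dy) ** 2
        return c.score - STIFFNESS * stretch

    return max((fit(c) for c in candidates), default=float("-inf"))


def score_person(torsos, parts):
    """Best total score over all torso candidates (-inf if there are none)."""
    best = float("-inf")
    for torso in torsos:
        total = torso.score + sum(
            best_part(torso, parts.get(name, []), TEMPLATE[name])
            for name in TEMPLATE
        )
        best = max(best, total)
    return best


def is_person(torsos, parts, threshold=2.0):
    """The checklist: do the parts fit the template well enough?"""
    return score_person(torsos, parts) >= threshold
```

Because each part is matched on its own once a torso is fixed, the program never has to compare a whole image against a database of whole people. That is the sense in which the problem is divided; a real system would also have to handle scale, rotation, and limbs that bend.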
Ramanan envisions many potential applications for his people-finding algorithms, including fast and accurate pedestrian-detection systems in self-driving cars and videogame systems that track full-body movements. In the meantime, he's focusing on teaching computers how to read and understand context: in other words, to think. "What if you really want to understand what a person is doing?" Ramanan asks. "Not just [understand] 'Here's the arm,' but 'This person is waiting for a bus.' " If his future projects succeed, computers' reasoning ability will keep inching closer to that of the human brain itself.