Invention Awards: The World as a Web Interface

SixthSense turns your surroundings into a gesture-controlled computer interface
Using SixthSense, grad student Pranav Mistry can operate his laptop, snap photos, and more with hand signals alone. Photo: John B. Carnett

Remember that awesome scene in Minority Report when Tom Cruise just wiggles his hands in the air to sift through information? Today’s featured Invention Award winner brings it to life.

When he’s wearing the SixthSense, a combination miniature projector, webcam and notebook computer, Pranav Mistry can snap photos just by making the shape of a frame with his fingers.

He can conjure a phone keypad in the palm of his hand and tap the virtual numbers to place a call. The system can even recognize a book in front of the camera, retrieve its Amazon listing from the Web, and project its rating on the cover. Watching Mistry, a graduate student in the Massachusetts Institute of Technology’s Media Arts and Sciences program, demonstrate the device is like witnessing a magic show. But he and his adviser, Pattie Maes, a digital-interface specialist at MIT’s Media Lab, expect the SixthSense to do a lot more than evoke wonder. Within a few years, they hope, it will let people operate smartphones without touching a button, do instant research on objects around them, and generally offer the kind of enhanced-reality experience that’s now confined to science fiction.

Maes hit on the idea last October while discussing g-speak, a real-world version of the gesture-controlled interface in the movie Minority Report. She liked the notion of using hand signals to manipulate digital content but wanted something cheaper that you could walk around with, projecting content and interacting with it anywhere you liked. Mistry, nicknamed “Zombie” because of his aversion to sleep, turned out a prototype in just three weeks.

Although the system has evolved considerably since then, the basic concept has stuck. A pocket projector and a webcam hang on Mistry’s chest, both wired to a laptop in his backpack, and he wears four different-colored marker caps or pieces of tape on his thumbs and index fingers. When he switches on the system, the webcam starts capturing video and streaming it back to the computer. Then the computer’s vision algorithms take over. This software, the real brains of the system, filters out background imagery, determines x and y coordinates for each cap or tape color in the video frame, and tracks them over time. The computer discerns which colors are moving which way, so it can follow freehand gestures. These, in turn, trigger various functions.
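For readers who want a feel for the tracking step, here is a minimal Python sketch of the general idea: ignore background pixels, find the x and y position of each marker color in a frame, and collect those positions over time. It is not the SixthSense software. The reference colors, the tolerance value, and the names `locate_markers` and `track` are all assumptions made up for illustration, and a frame is simplified to a plain grid of RGB tuples.

```python
from typing import Dict, List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (r, g, b), each 0-255
Frame = List[List[Pixel]]      # rows of pixels, indexed frame[y][x]
Point = Tuple[float, float]    # (x, y) in pixel coordinates

# Hypothetical reference colors for the four finger markers.
MARKER_COLORS: Dict[str, Pixel] = {
    "red": (220, 30, 30),
    "green": (30, 200, 30),
    "blue": (30, 30, 220),
    "yellow": (230, 230, 30),
}


def is_close(pixel: Pixel, target: Pixel, tolerance: int) -> bool:
    """True if every channel is within `tolerance` of the target color."""
    return all(abs(p - t) <= tolerance for p, t in zip(pixel, target))


def locate_markers(frame: Frame, tolerance: int = 60) -> Dict[str, Optional[Point]]:
    """Return the centroid of each marker color, or None if it is not visible.

    Pixels that match no marker color are treated as background and skipped.
    """
    sums = {name: [0, 0, 0] for name in MARKER_COLORS}  # x sum, y sum, count
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            for name, target in MARKER_COLORS.items():
                if is_close(pixel, target, tolerance):
                    acc = sums[name]
                    acc[0] += x
                    acc[1] += y
                    acc[2] += 1
                    break
    return {
        name: (sx / n, sy / n) if n else None
        for name, (sx, sy, n) in sums.items()
    }


def track(frames: List[Frame]) -> Dict[str, List[Point]]:
    """Collect each marker's position frame by frame into a path."""
    paths: Dict[str, List[Point]] = {name: [] for name in MARKER_COLORS}
    for frame in frames:
        for name, position in locate_markers(frame).items():
            if position is not None:
                paths[name].append(position)
    return paths
```

Feeding `track` a sequence of frames yields one path per color, which is the kind of data a gesture recognizer could then interpret. A real system would work on camera images and would need to cope with lighting changes, which this sketch ignores.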

Say, for instance, Mistry wants to know the time. He traces a small circle on his wrist with his index finger, and the computer tracks the red marker cap or piece of tape, recognizes the gesture, and instructs the projector to flash the image of a watch onto his wrist. For book-recognition, Mistry activates the program with a gesture, and the system snaps a photo of the book, compares it with book-cover images it finds online, computes a match, and retrieves and projects the ratings. Future functions will similarly rely on computer vision algorithms. “It recognizes what’s in front of the user and augments those things with relevant information,” Maes explains.
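How might software decide that a tracked fingertip has drawn a circle? The article does not say how SixthSense does it, so the following Python sketch shows just one plausible approach under stated assumptions: a path counts as a circle if its points stay at a roughly constant distance from their center and sweep through nearly a full turn. The function name `is_circle` and both thresholds are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def is_circle(
    path: List[Point],
    max_radius_spread: float = 0.25,    # assumed tolerance on roundness
    min_sweep: float = 1.8 * math.pi,   # assumed: at least 90% of a full turn
) -> bool:
    """Heuristic check that a tracked path looks like one traced circle."""
    if len(path) < 8:
        return False

    # Estimate the center as the average of all points.
    cx = sum(x for x, _ in path) / len(path)
    cy = sum(y for _, y in path) / len(path)

    # Distances from the center should be nearly constant.
    radii = [math.hypot(x - cx, y - cy) for x, y in path]
    mean_r = sum(radii) / len(radii)
    if mean_r == 0:
        return False
    if (max(radii) - min(radii)) / mean_r > max_radius_spread:
        return False

    # Add up the signed angle swept between consecutive points.
    angles = [math.atan2(y - cy, x - cx) for x, y in path]
    sweep = 0.0
    for a, b in zip(angles, angles[1:]):
        step = b - a
        # Unwrap so each step lies in (-pi, pi].
        while step > math.pi:
            step -= 2 * math.pi
        while step <= -math.pi:
            step += 2 * math.pi
        sweep += step
    return abs(sweep) >= min_sweep
```

In a setup like the one described, a positive result for the red marker's recent path could be the trigger that tells the projector to show the watch image.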

This summer, Mistry will begin working with Samsung engineers to compress the entire system into one of the company’s new smartphones, which has a built-in projector. With further improvements to the algorithms, eventually even the markers and tape could go away and the device could track fingers alone, making it even easier to enhance your surroundings anywhere you go.

[Video: Pattie Maes presenting SixthSense at the TED conference]

Invention: SixthSense
Inventor: Pattie Maes and Pranav Mistry
Cost: $350
Time: 8 months
Is It Ready Yet? 1 2 3 4 5

Check out the rest of PopSci’s 2009 Invention Award winners!