
If you want your humanoid robot to realistically simulate facial expressions, it’s all about timing. And for the past five years, engineers at Columbia University’s Creative Machines Lab have been honing their robot’s reflexes down to the millisecond. Their results, detailed in a new study published in Science Robotics, are now available to see for yourself.

Meet Emo, the robot head capable of anticipating and mirroring human facial expressions, including smiles, within 840 milliseconds. But whether or not you’ll be left smiling at the end of the demonstration video remains to be seen.


AI is getting pretty good at mimicking human conversations—heavy emphasis on “mimicking.” But when it comes to visibly approximating emotions, its physical robot counterparts still have a lot of catching up to do. A machine misjudging when to smile isn’t just awkward; it draws attention to its artificiality.

Human brains, in comparison, are incredibly adept at interpreting a huge array of visual cues in real time and responding accordingly with various facial movements. This complexity makes it extremely difficult to teach AI-powered robots the nuances of expression, and it’s just as hard to build a mechanical face capable of realistic muscle movements that don’t veer into the uncanny.


Emo’s creators attempt to solve some of these issues, or at the very least, help narrow the gap between human and robot expressivity. To construct their new bot, a team led by AI and robotics expert Hod Lipson first designed a realistic robotic human head with 26 separate actuators to enable nuanced facial expressions. Each of Emo’s pupils also contains a high-resolution camera to follow the eyes of its human conversation partner—another important nonverbal visual cue for people. Finally, Lipson’s team layered a silicone “skin” over Emo’s mechanical parts to make it all a little less… you know, creepy.

From there, researchers built two separate AI models to work in tandem—one to anticipate a human’s expression from the minuscule shifts in a target face, and another to quickly issue motor commands for the robot’s own face. Using sample videos of human facial expressions, Emo’s AI then learned emotional intricacies frame by frame. Within just a few hours, Emo was capable of observing, interpreting, and responding to the little facial shifts people tend to make as they begin to smile. What’s more, it can now do so within about 840 milliseconds.
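The study doesn’t come with a drop-in reference implementation, but for the curious, a rough sketch of how a “predict the expression, then pose the face” pipeline could hand off between two models might look like the Python below. The landmark count, layer sizes, and observation window are illustrative assumptions, not the Columbia team’s actual architecture; only the 26-actuator figure comes from the robot itself.

```python
# Illustrative sketch only: module names, sizes, and the landmark/window counts
# are assumptions for exposition, not the Columbia team's published code.
import torch
import torch.nn as nn

N_LANDMARKS = 68   # assumed number of tracked facial landmarks (x, y pairs)
N_ACTUATORS = 26   # Emo's face uses 26 actuators, per the study
WINDOW = 10        # assumed number of recent camera frames the predictor sees


class ExpressionPredictor(nn.Module):
    """Guesses the landmark layout of an upcoming expression (~0.8 s ahead)
    from a short window of recently observed landmark frames."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                                   # (B, WINDOW, 68, 2) -> (B, WINDOW*68*2)
            nn.Linear(WINDOW * N_LANDMARKS * 2, 256),
            nn.ReLU(),
            nn.Linear(256, N_LANDMARKS * 2),                # predicted future landmark positions
        )

    def forward(self, recent_frames: torch.Tensor) -> torch.Tensor:
        return self.net(recent_frames)


class InverseFaceModel(nn.Module):
    """Maps a target facial-landmark configuration to actuator commands,
    so the robot can pose its own face to match."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_LANDMARKS * 2, 128),
            nn.ReLU(),
            nn.Linear(128, N_ACTUATORS),
            nn.Sigmoid(),                                   # normalized actuator positions in [0, 1]
        )

    def forward(self, target_landmarks: torch.Tensor) -> torch.Tensor:
        return self.net(target_landmarks)


# The two models in tandem: camera frames -> predicted expression -> motor commands.
predictor, inverse_model = ExpressionPredictor(), InverseFaceModel()
recent_frames = torch.randn(1, WINDOW, N_LANDMARKS, 2)      # stand-in for tracked landmarks
predicted_expression = predictor(recent_frames)
actuator_commands = inverse_model(predicted_expression)     # shape: (1, 26)
```

In a setup like this, the payoff is that the robot starts moving its motors toward the expression it expects to see, rather than reacting only after the human’s smile has fully formed.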

“I think predicting human facial expressions accurately is a revolution in [human-robot interactions],” Yuhang Hu, Columbia Engineering PhD student and study lead author, said earlier this week. “Traditionally, robots have not been designed to consider humans’ expressions during interactions. Now, the robot can integrate human facial expressions as feedback.”

Right now, Emo lacks any verbal interpretation skills, so it can only interact by analyzing human facial expressions. Lipson, Hu, and the rest of their collaborators hope to soon combine Emo’s physical abilities with a large language model system such as ChatGPT. If they can accomplish that, Emo will be even closer to natural(ish) human interactions. Of course, there’s a lot more to relatability than the smiles, smirks, and grins the scientists appear to be focusing on. (“The mimicking of expressions such as pouting or frowning should be approached with caution because these could potentially be misconstrued as mockery or convey unintended sentiments.”) However, at some point, the future robot overlords may need to know what to do with our grimaces and scowls.