Solve for Standing Ovation: Should AI Researchers Bother Building a TED-Bot?

TED@Tunis, part of the TED2013 Talent Search. Photo: James Duncan Davidson
The latest XPrize is, without a doubt, the strangest XPrize so far.

It’s called the A.I. XPrize presented by TED, a name that seems to say it all, but doesn’t quite. As announced last week, the competition is “a modern-day Turing test to be awarded to the first A.I. to walk or roll out on stage and present a TED Talk so compelling that it commands a standing ovation from you, the audience.”

The actual prize, however, is unspecified. And the rules presented are an example of what the contest might ultimately look like, based on ideas submitted by the general public.

Still, that one-sentence description, along with the sample rules, raises questions about the validity of the first artificial intelligence-related XPrize. What, if anything, does a machine-delivered TED Talk have to do with pushing the development of AI?

First, those sample rules. As the XPrize site specifies, “Elements of this concept may or may not be used,” but the details are nonetheless revealing. In the proposed scenario, teams would be given 100 discussion topics ahead of time. One of those topics would be selected (at random, or by the audience) at the TED conference, and the AI would have 30 minutes to come up with a 3-minute presentation. After the resulting TED Talk, “the audience would vote with their applause and, if appropriate, with a standing ovation.”

Past XPrize competitions have involved fully empirical victory conditions—crossing a finish line first, or reaching suborbital altitudes twice in as many weeks. In this notional example, would collective decibel levels push a giant applause-meter toward the auditorium ceiling? And what if two robots get standing ovations at the same conference? Will organizers time the speed at which each ovation was achieved, or the howls emitted, or the tears produced?

Facetious as those questions might sound, the murky, talent-show mechanics are relevant to the next, and likely more challenging, phase of the competition.

After the audience-rated TED Talk, the AI would have to answer two questions posed by Chris Anderson, curator of the TED conferences. An expert panel would “add their votes.” This might be the tie-breaker for an ovation standoff. Or perhaps each expert’s applause will be magnified twenty-fold?

Suffice it to say, the AI XPrize doesn’t seem to have been cooked up by AI researchers. The sample rules betray a much looser, more entertainment-focused take on the XPrize, a kind of marketing stunt supported by real research.

Unless, that is, the researchers simply fake it. “I think the spirit of the challenge is quite cool and quite useful,” says Noah Goodman, a cognitive scientist at Stanford University. “But as it’s presented right now, it’s easily cheatable. You could get a bunch of artists and intellectuals, and make up a script for each of those 100 topics, and then play the right one back when you know which topic it is. Then you’ve solved the first part without using AI at all.”

A question-and-answer session wouldn’t be so easily gamed, but by then, the standing ovation will have already been granted or denied. It’s the rough equivalent of crowning Miss America, and then quizzing her on geopolitics.

The point of a Q&A, however it might ultimately be structured, is to provide a self-contained Turing test, attaching an element of classic, straightforward AI research to what might otherwise appear to be robotic stagecraft. First proposed by computer pioneer Alan Turing in 1950, the Turing test asks a machine to prove its intelligence by duping human judges, convincing them through a series of off-the-cuff responses that it’s as human as they are.

Yet the Turing test is more science history than science, an approach that’s been abandoned by the vast majority of AI researchers. Every year, teams compete for the Loebner Prize, with money awarded to the most human-like chatbots. And every year, no one outside of the competition cares. “The Loebner Prize has been fun, but it’s a complete failure in terms of pushing the scientific agenda forward, and attracting wider public interest in AI,” says Goodman. “The Turing test served an incredibly valuable role as a thought experiment. It’s not a viable research goal.”

Of all the Turing test’s problems, one of the biggest has to do with humans. Judges are often thrown off by humor: not adaptive, creative bits based on new information, but canned jokes, unleashed at tactical intervals to avoid having to craft a more relevant response. Humans, it turns out, are unreliable arbiters of Turing tests. “This XPrize would have that same problem. As anyone who’s watched a lot of TED videos knows, they have an extremely striking and easily recognizable cadence and prosaic style,” says Goodman. “You could get pretty far by just programming in the jokes and particular stylistic tricks used to get standing ovations.”

For the record, Goodman isn’t as harsh a critic of this XPrize as I am; he’s genuinely excited to see what the organizers come up with. Kerstin Dautenhahn, a professor of artificial intelligence at the University of Hertfordshire, is less enthusiastic. “The main robotics challenges now and in the future are about interaction with the world, not about performance skills,” says Dautenhahn, whose work encompasses AI as well as human-robot interaction and social robotics.

To Dautenhahn, the primary challenges of AI-driven robots are in how they interact with their inanimate and social environments: navigating a room, for example, as well as navigating a conversation. “I can’t see any of these elements in the XPrize, so I’m not sure which useful skills the prize rules assess, useful in the sense of advancing real-life AI,” she says.

It’s entirely possible that, once it’s more fully realized, the AI XPrize will be a more meaningful research endeavor, and less of a zany stunt. To Goodman, that shift could be as simple as forcing the AI to improvise, a key feature of human intelligence and a persistent stumbling block for machines. Never mind the set of 100 topics, for example (a framework that begs for pre-recorded presentations). Instead, the system could be given a random topic on-site, allowed to compile data from the internet, and then deliver a presentation. “That’s not possible right now,” says Goodman. “I could imagine it being possible in 5 years.”

Personally, I think the AI XPrize has bigger problems than its rules, notably its relationship with its sponsor. The Progressive Automotive XPrize didn’t ask teams to create cars with inherently low insurance premiums. The AI XPrize is remarkably TED-centric, taking place on TED turf, with success or failure measured against previous TED Talks. There’s no denying the reach of these conferences, and the attention that a TED-bot would garner. But when you’re asking computer scientists to solve for a standing ovation, it’s more carnival act than serious competition. For the first time, it might be possible to win an XPrize without winning anyone’s respect.