The DARPA Robotics Challenge Was A Bust

It’s been close to a month since the DARPA Robotics Challenge (DRC) wrapped up. That’s time enough to face facts. The biggest and most well-funded international robotics competition in years was a failure.

That doesn’t feel good to write. The DRC was a huge undertaking, spanning years and costing millions. The competition had a noble goal—the development of robots that can better respond to disasters—and it attracted many of the world’s smartest and most accomplished roboticists.

And I know I’m not speaking truth to power by pointing out the disappointments of the DRC. Despite an increase in acquisitions, investments and recruiting in the last few years, robotics is a field composed mostly of underdogs. If anything, I’m kicking sand in the faces of researchers who’ve spent nearly three years losing sleep, neglecting loved ones, and generally pouring their lives into building and programming machines that wound up looking almost universally unimpressive. Of the 24 robots that showed up to compete in last month’s DRC Finals in Pomona, California, only a few made it through the challenge course on their feet. As a result, the biggest news out of the DRC seems to be a parade of GIFs of robots falling. One bot fell so hard, its head popped off.

But don’t pity these bumbling robots. If the DRC hadn’t been so rife with slapstick, it would have put everyone to sleep. After all, the robot that racked up the most points, in the least amount of time, took nearly 45 minutes to complete a series of eight tasks that my kindergarten-age daughter could probably accomplish in 10 minutes.

And, with all due respect to my human offspring, that’s not a compliment. DRC-Hubo, the triumphant, 5-foot 9-inch humanoid robot fielded by South Korea’s Team KAIST, spent most of its time on the open-air challenge course—a long stretch of dirt leading to a mock facility about the size of a single-bedroom apartment—doing nothing, in one place or another. It started its competition-winning run strong, by driving about as well as a person might, and getting out of its modified Polaris utility vehicle more quickly, and with fewer awkward starts and stops, than any other robot at the event. Then DRC-Hubo dropped down onto its wheeled knees, rolled slowly up to the closed door that represented the entrance to the simulated facility, and froze. For long minutes the most capable robot in the DRC prepared for the daunting task of turning a handle, and pushing a door.

Eventually, it did both of those things, and the crowd in the stands at the Fairplex erupted in cheers. Anyone just arriving at the event—making their way up the escalator, or grabbing a hot dog before heading out to the seats—might have heard that roar and imagined robots vaulting over rubble, or bashing through a concrete wall. But the DRC’s threshold for cheer-worthy feats was considerably lower. Some robots never made it to that door. One humanoid model collapsed in the opening seconds of the competition, and kept falling until its team pulled it off the course. Another humanoid tipped over while exiting the Polaris, and damaged itself so grievously that it literally bled, leaving behind a pool of oil for DARPA staffers and team members to scrub away. And those machines who conquered the door had to contend with such hazards as a floor with a two-to-three degree slope, and a path obstructed by precisely eight pieces of debris. This was a contest whose entries were so incompetent, at least compared to humans, that simply opening that door counted as a legitimate victory.

This wasn’t the DRC that DARPA originally pitched. In April 2012, when the agency first outlined the scope and parameters of the competition, it seemed impossible. DARPA appears to have taken down its original news release, but this is how Virginia Tech described the event’s tasks, when it announced its own participating team. Pay special attention to the last of the eight proposed tasks:

If the DRC had included robots breaking through concrete walls, forget about cheering crowds. The Fairplex would have exploded, and the competition would have had television ratings that rivaled the Olympics. But the DRC Finals were a study in compromise, and while DARPA always warned that the competition’s rules would remain secret and mutable right up to the end, only one of those eight proposed tasks remained intact. Here’s a task-by-task breakdown:

Getting into the vehicles was not part of the final event—teams could carefully load and position their robot before the clock started. And those vehicles weren’t “standard,” since all of the Polaris utility vehicles used in the DRC featured improved suspensions to support heavier loads, and all but one of those vehicles were modified to allow robots to drive and/or exit the vehicle.

The robots were required to egress, but there was no rubble between them and the door.

Never happened. Instead, for the seventh of the eight total tasks, robots had to reach the mock facility’s exit by either dealing with a stretch of floor obstructed by eight pieces of debris, or by traversing a path comprised of cinder blocks.

This is the only task that wasn’t explicitly or implicitly downgraded.

Though one of the tasks did involve rotating a circular valve a full 360 degrees, it was the only valve in the course. The notion that robots would be locating one valve out of many didn’t apply.

Reconnecting a hose sounds pretty cool, doesn’t it? Imagine the fine motor skills required to pull that off, and with little to no direct control from remote human operators, since DARPA also promised to degrade communication signals, and therefore demand more autonomy of its robot participants. Instead, there was a surprise task. On the first day, it was a big switch that had to be pulled down. On the second day, it was a cable plugged into the wall that had to be removed and plugged into another socket. But these were props, essentially, with no prongs to contend with. They were held in place with magnets, like an industrial-size version of a MacBook’s MagSafe power adapter.

Though not the most promising visual in the DRC, as originally described (that’s coming up next), robots climbing ladders would have been stunning. And the actual task of ascending a ladder would have been maybe the most technically challenging aspect of the competition, requiring a tremendous amount of strength in various components, and an unprecedented combination of manipulation (to grasp the rungs) and limbed mobility.

Instead, the DRC’s final task was to climb a total of four stairs.

Try to picture this happening. A humanoid robot picks up a tool—teams initially assumed it would be a Sawzall—buzzes through a wall, and leaves the course through a hole made with its own autonomous brains and mechanical might. This was going to be the showstopper. Humans would have run in terror, wept with joy, or at least paid attention to the biggest robotic competition that the world has seen since DARPA’s last, historic robot contest, the Urban Challenge, a driverless car race held in 2007.

But in the DRC that we actually got, the minority of robots who survived long enough to reach the power tools, which were screwdrivers, not saws, had to carve a small hole in a wall (as indicated by a circle) and then move on. The resulting holes would have been big enough for a frightened cat to scramble through, or for a trapped human to stick his or her head out, and yell at the robot that’s slowly—ever so slowly—inching towards the exit.

* * *

There were signs along the way that DARPA’s experiment had fizzled. At the DRC Trials, held in Miami in 2013, robots were given up to 30 minutes for each of their eight tasks. Most used at least half of that allotted time, making for runs that felt endless, and were only possible because of their attached power cords. More worrying still, some robots simply skipped the trickier tasks, such as driving. And during a telephone briefing this past March, DRC program director Gill Pratt stated that, during the finals, robots would not have to get into vehicles on their own. He also mentioned that, as in the trials, some teams might opt to forfeit the points associated with driving.

But Pratt spoke at length during that call about the issue of falling. “If they do fall down, they’re going to have to get up on their own,” he said. In the DRC Trials, robots had safety belays that prevented them from hitting the ground. But those cords were being cut for the finals. “We’re trying to make this contest more authentic, to what a real disaster would be like, where of course human beings couldn’t suddenly go in to rescue the robot in a disaster zone,” Pratt said.

When pressed further about how falling would be handled in the competition’s final stage, Pratt went on:

Pratt cited CMU’s CHIMP as an example of a robot that, by design, essentially could not fall. The 443-pound machine is statically stable, meaning that, unlike the systems that used bipedal walking to get around, it doesn’t have to actively maintain balance. Even if CHIMP were to unexpectedly power down, it wouldn’t topple. “And for those teams who don’t have to worry about that, well, maybe you made the right choice,” Pratt added.

The more Pratt talked about falling, however, the less punitive it sounded. Despite initially saying that the robots would have to get up on their own, he later conceded that teams might be able to put a fallen robot back on its feet, and simply take a time penalty. This would roughly simulate a situation where disaster responders have more than one robot to work with, and would be able to deploy a backup system should the first one go down.

Still, Pratt was enthusiastic about the prospect of seeing robots rise from the ground from falls, noting that, for the bots that advanced to the finals without competing in the 2013 trials, it was almost mandatory. “Having a machine get up from prone was one of the qualification tasks that all of the new teams have done. That said, we did not push the teams too far, to have to demonstrate that they can also survive a fall. It’s not just the fact that you can get up from lying down, you also have to be able to not get hurt when you fall down,” Pratt said. “I haven’t seen very much of that yet. It will be neat to see which teams can pull that off and which can’t.”

Pratt was right: For the majority of the robots in the DRC Finals, falling was a certainty. But his prescient advice, to practice falling and recovering before the competition, was blatantly ignored. None of the teams rehearsed that scenario in full, without a safety tether, prior to showing up in Pomona. And though the media was invited, and then disinvited, to attend the pre-event test runs, where DARPA could assess the overall capability of the robots, whatever the organizers saw there convinced them to hobble the competition yet again.

Robots were not forced to get up on their own. Nearly every team whose machine tumbled simply ate the 10-minute time penalty. Some did so multiple times, implying a scenario where responders bring an entire squad of identical, blundering bots to a disaster, knowing full well that they’re liable to faceplant while facing such harrowing obstacles as a door handle, or a handful of stairs. As for the late-addition teams for whom getting up from prone was a mandatory requirement, that capability was MIA at the competition. When robots hit the ground at the DRC, which was constantly, they didn’t get up. They either lay there like corpses, or continued whatever movement they were engaged in before the seemingly inevitable loss of balance. As the falls kept coming, the state of humanoid robotics was exposed, in all its disappointing fragility. The already unfortunate impression that we were watching cybernetic hybrids of ungainly toddlers and disoriented seniors became, against all odds, even worse, when these would-be disaster responders waited for a bunch of humans to hoist them upright. Time and again, the spectators watched team members and DARPA officials struggling with cables and gantries, and putting their backs into this effort—remember, most of these bots weighed between 200 and 400 pounds—while the powered-down robots did nothing.

The sole exception was CHIMP, one of the robots that was supposedly incapable of falling. To everyone’s surprise, it did, having left its arms extended forward after opening the door, and encountering that two-to-three-degree disparity between the ground outside and inside of the mock facility. The robot’s center of gravity was momentarily non-optimal. Down went CHIMP.

What followed were the most suspenseful minutes in the entire competition, as the bot—already a fan-favorite, thanks to its striking, primate-inspired design, and attention from media outlets (ours included) in the run-up to the finals—struggled to get up. It wasn’t pretty, and it wasn’t fast. A reset, and corresponding time penalty, would have scuttled the promising robot’s chances of completing a prize-winning run. And with each failed, and undignified bout of writhing, it seemed more likely that CMU would join that sad DRC tradition, of treating its robot like a mechanical invalid. So when CHIMP finally got itself onto all fours, and rolled back into the competition, it was a triumph.

Or it seemed like it, at the time. There’s no denying that CHIMP’s eventual third-place finish was due to world-class engineering, the deep bench of robotics talent that CMU has developed, and some unquantifiable amount of grit and gumption on the part of the team members. They didn’t surrender to gravity. They did what no other team at the DRC even attempted. They got up! But it’s possible to succeed at a competition that itself is a failure.

The DRC was a bust not because its participants were lazy, or unequal to the task. That wasn’t the case for any of the teams that made it to the finals, or even to the trials in Miami. It was a failure on its own terms, which shifted over the course of the competition. The DRC that was presented in 2012 was an astonishing vision of what robots might be capable of, in just a few years. The DRC that wrapped up in 2015 was a reality check. Worst of all, it failed to catch or at least hold the attention of the general public. The event was webcast, but not televised. News outlets with a focus on tech and science covered the finals, but not enough to pique the interest of more mainstream media. Years of work and tens of millions of dollars in funding culminated in an event that no one appeared to care about, despite the fact that it featured walking, driving, tool-grabbing humanoid robots.

This is a shamelessly unscientific survey, but no one among my family or friends knew that the DRC Finals were happening. That includes my brother, who works at NASA, and my 11-year-old nephew, who’s in an engineering program sponsored by Lockheed Martin, a defense contractor that had a team in the DRC. When the competition was over, the only person in my social circle—which is rife with nerds—who had read or seen anything about it was my father-in-law. His sole takeaway: the Pentagon held a contest where a bunch of robots fell down.

Sadly, he nailed it. The DRC was a high-stakes and extremely well-funded robotics competition that turned out really silly. Its optics are those of feeble robots face-planting for no discernible reason, rather than heroic (or hellish) machines smashing through concrete walls. Even in their severely diminished states, the tasks were too much for all but a handful of bots to accomplish. When DARPA’s Urban Challenge wrapped up in 2007, and driverless cars suddenly seemed like an inevitability, the world was watching. Cars had driven through a mock city with no one behind the wheel or controlling them remotely, and only a couple had managed to crash. It was a worldwide revelation. This time, the best DARPA could hope for was that someone paid enough attention to the DRC to ridicule it.

But DARPA has been here before. Before the axis-tilting success of the Urban Challenge, there was the agency’s first Grand Challenge. In 2004, twenty-five autonomous vehicles took on a 150-mile-long course in the Mojave desert. “Every car failed, and the closest was CMU, which got seven and a half miles in and then hit a boulder,” says Boris Sofman, CEO of Anki, a San Francisco-based artificial intelligence startup that makes autonomous toy cars. Sofman, who earned his PhD at CMU’s Robotics Institute, made the connection that I wish I could take credit for: the DRC was unimpressive compared to the Urban Challenge, but the better comparison is that first Grand Challenge. “But just three years later, they had autonomous cars driving in an urban setting, with moving vehicles, following traffic laws,” says Sofman. “And less than 10 years later we have truly autonomous cars that are already at a functionality that’s better than humans, in a lot of the roads they’re being tested in.”

DARPA’s response to the 2004 Grand Challenge was bold. With the first event’s prize money (including $1M for first place) unclaimed, the agency scheduled another desert race for 2005. But the second Grand Challenge was more than a simple do-over. It was, in some respects, harder than the first event, featuring more obstacles, and various tunnels, which are a potential stumbling block for sensors. The competitors rose to the challenge. Five vehicles finished the course, prizes were awarded, and DARPA followed it up relatively quickly, with the yet more difficult Urban Challenge.

It’s painful to consider that the DRC might be the spiritual successor to that first Grand Challenge. But that initial dud in the Mojave desert was the foundation for the robot car competitions that came later, and for the rapid, and stunning pace of innovation in driverless vehicles within the commercial sector. If DARPA is serious about pushing the development of robots that could respond to disasters, or at least navigate and function within human environments without making dangerous fools of themselves, the first step is to recognize failure. The DRC was a bust. Now what?

My proposal isn’t humble, but it’s simple: Hold another competition, and make falling mandatory. Gill Pratt knew that falls would be the rule, not the exception, and no one listened to him. It’s no coincidence that Team KAIST, the South Korean team that won the DRC’s top prize of $2M, was one of the few in the finals whose bot stayed upright. That’s fortunate, since, according to team leader JunHo Oh, KAIST never bothered coming up with a strategy for getting up from a fall. Instead, they simply proceeded with an abundance of caution. The result was a performance that was good enough to win the finals, but if you applied the robot’s timid, halting, 45-minute slog to a real-life disaster, it’s hard to imagine it accomplishing anything useful. And shouldn’t a machine that’s designed to charge into an emergency be both capable of moving with some measure of speed, and capable of surviving the sort of stumble that any human responder would easily recover from?

In a follow-up to the DRC where robots are required to fall, durability would be a priority, just as it would be in a deployed system. Roboticists would have to keep impact and redundancy in mind when they bought or built actuators. Despite Pratt’s best intentions, the DRC wound up incentivizing fragile designs. Shouldn’t disaster bots be among the sturdiest? And the DRC’s most indelible and embarrassing visual, of robot after robot tipping over, ramrod straight, like a felled tree, could be supplanted by the more inspirational optics of machines getting back up.

New tasks, or even ones that are closer in difficulty to the original versions presented by DARPA, would also be exciting. But those are details for people much smarter than me to work out. If there’s a single lesson from the DRC, it’s that humanoid robots are falling robots. Also, that the road to humanoids that aren’t so clumsy will be long, and strewn with shattered components. If DARPA doesn’t hold another version of the DRC, then the first one will have been little more than a grim status update, and a self-contained failure, for the few of us that realized it happened. But if the next challenge is tougher, and more realistic, this past competition will be as history-making as so many of us wanted it to be.