Why DARPA wants its robots to think like kids

If machines can learn as adeptly as children do, the military thinks they might be able to help out in useful ways.

By Kelsey D. Atherton

Posted on Jun 29, 2022

Learning to walk is about learning from failure, and then trying again. Each new footstep carries with it a weight imbalance, and the possibility that the ground underneath will be a different texture or density than it was before. A toddler who learns to walk on the rocking deck of a ship may struggle when, on dry land, the ground underfoot doesn’t move in the expected way.

For DARPA, the military’s blue-sky projects wing, teaching robots to walk on new terrain means embracing learning like a toddler. Learning to walk starts with a lot of failure, but by learning how to adjust to failure, robots could tackle wholly new environments based on intuition alone.

This is the domain of Machine Common Sense, a DARPA initiative about developing a kind of AI that allows robots, first in simulation and then in the real world, to emulate a toddler’s ability to understand, interact with, and navigate through the world. It includes efforts on processing language, manipulating objects, and moving across unfamiliar terrain.

“The inspiration for the program was that although AI has produced many very stunning systems that have shown expert level performance on many tasks, in general AI systems are brittle and tend to lack the common sense that any person in the street would have,” says Howard Shrobe, DARPA’s manager for Machine Common Sense.

“The ultimate goal, though, is to enable computer systems and robotic systems to be able to be trained in much the same way that we train soldiers in technical areas that they work on within the military,” says Shrobe.

Just read the instructions

Shrobe imagines how useful it could be if machines could learn as adeptly as humans do. “Way down the road you could imagine a robot technician reading the instruction manual for how to do some motor repair on a vehicle and being able to take that language description and possibly some videos, and then just execute it, because it’s perfectly capable of figuring out how to take the motions it already knows how to do and compose them to do the things that are implied by the instructions,” adds Shrobe.

For that to work, an AI has to do more than absorb a manual and repeat the information it contains. It also has to discern all the integral knowledge that is not explicitly stated in the instructions but is vital to the process anyway.

“You can imagine a recipe for making scrambled eggs, and it might start off by saying ‘put two eggs in a bowl.’ And even those of us who were pretty bad cooks understand that it didn’t literally mean put two eggs in a bowl. It meant crack the eggs and put ’em in the bowl. And it didn’t tell you where you would likely be able to find eggs or tell you how to crack them,” says Shrobe.

Cookbooks, like other instruction manuals, operate from the premise that a person opening the book already knows this kind of implicit information, so that the reader can focus on the task at hand. If military machines can be built with AI that can discern this common sense from reading, then the AI can perform specialized tasks without first having to be taught every component part of the task.

Can machines learn object permanence?

Developing common sense for machines means revisiting how artificial systems perceive, incorporate, and adapt to new knowledge. Some of this is knowledge of how bodies work and exist in space, like walking over new and uneven terrain. Another part of this is teaching an image recognition program to have object permanence, so that if a camera sees a ball roll behind a wall it does not catalog the ball as a new object when it emerges from the other side. This is the kind of knowledge that comes intuitively to humans, though often through some trial and error, in infancy.
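The ball-behind-a-wall idea can be made concrete with a toy tracker. What follows is only a minimal sketch, not a description of any DARPA system: it assumes one object moving along a single axis, and the class name `PermanenceTracker` and the parameters `max_hidden_frames` and `match_radius` are hypothetical choices made for illustration. While the object is hidden, the tracker keeps predicting where it should be, so a reappearance near that spot keeps the old identity instead of being cataloged as something new.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    track_id: int
    position: float        # last seen or predicted position along one axis
    velocity: float        # estimated change in position per frame
    frames_hidden: int = 0


class PermanenceTracker:
    """Toy single-object tracker (illustrative assumption, not a real system)."""

    def __init__(self, max_hidden_frames: int = 10, match_radius: float = 2.0):
        self.max_hidden_frames = max_hidden_frames
        self.match_radius = match_radius
        self.track: Optional[Track] = None
        self._next_id = 0

    def update(self, detection: Optional[float]) -> Optional[int]:
        """Process one frame; `detection` is None when nothing is visible."""
        if self.track is not None:
            predicted = self.track.position + self.track.velocity
            if detection is None:
                # Object is occluded: keep believing in it and coast forward.
                self.track.position = predicted
                self.track.frames_hidden += 1
                if self.track.frames_hidden > self.max_hidden_frames:
                    self.track = None  # hidden too long, give up on it
                    return None
                return self.track.track_id
            if abs(detection - predicted) <= self.match_radius:
                # Close to where we expected it: same object, same ID.
                self.track.velocity = detection - self.track.position
                self.track.position = detection
                self.track.frames_hidden = 0
                return self.track.track_id
        if detection is None:
            return None
        # Nothing tracked, or the detection is too far away: start a new object.
        self.track = Track(self._next_id, detection, 0.0)
        self._next_id += 1
        return self.track.track_id


# A ball rolls right, disappears behind a wall for two frames, then reappears.
frames = [0.0, 1.0, 2.0, None, None, 5.0, 6.0]
tracker = PermanenceTracker()
ids = [tracker.update(f) for f in frames]
```

In this run every frame reports the same ID, including the two frames where the ball is out of sight. Setting `max_hidden_frames=1` makes the tracker forget the ball too quickly, and the reappearing ball is then treated as a new object.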

For machines learning physical tasks, that knowledge can be acquired not by reading manuals, but by performing and adapting, with programming capable of taking unexpected changes and turning them into knowledge. One example is a four-legged robot learning to maintain balance even as weights are thrown onto its back.

The machine common sense needed to navigate both manuals and obstacles is built on the same process of machine learning and deep reinforcement learning. In simulated environments the AI understands the parameters of the task set before it, and comes up with approaches for how to proceed when given new information. This means drawing from accumulated experience and attempting a strategy that approximates the present situation. Crucially, the AI then keeps learning as it attempts to navigate the task in real life.

In simulation, a robot might walk over a hill, then stumble over some cinder blocks on the other side of it. In real life, that robot may summit a hill, and then encounter a fallen log. Thanks to stumbling in simulation, the robot could navigate a similar situation without tripping up.
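The stumble-in-simulation, succeed-in-real-life idea can be sketched in a few lines. This is a deliberately tiny stand-in, not the deep reinforcement learning the program uses: it swaps in a simple table of learned values, and everything in it is an assumption made for illustration, including the terrain names, the obstacle heights, the two gaits, and the reward numbers. The learner tries gaits in simulation, lowers its opinion of the ones that lead to a stumble, and then handles an unseen obstacle by reusing what worked on the most similar simulated terrain.

```python
# Hypothetical toy world: each terrain is described only by obstacle height (meters).
SIM_TERRAINS = {"flat_ground": 0.0, "cinder_blocks": 0.2}
ACTIONS = ["normal_step", "high_step"]


def simulated_reward(obstacle_height: float, action: str) -> float:
    """Made-up rewards: +1 for a clean stride, -1 for a stumble,
    and a small reward for a needlessly slow high step on flat ground."""
    if obstacle_height > 0.1:
        return 1.0 if action == "high_step" else -1.0
    return 1.0 if action == "normal_step" else 0.2


def train_in_simulation(visits_per_terrain: int = 20, learning_rate: float = 0.5):
    # Optimistic starting values push the learner to try every gait at least once.
    values = {name: {a: 2.0 for a in ACTIONS} for name in SIM_TERRAINS}
    for _ in range(visits_per_terrain):
        for name, height in SIM_TERRAINS.items():
            # Pick the gait that currently looks best on this terrain.
            action = max(ACTIONS, key=lambda a: values[name][a])
            reward = simulated_reward(height, action)
            # Nudge the estimate toward what actually happened.
            values[name][action] += learning_rate * (reward - values[name][action])
    return values


def choose_action(values, obstacle_height: float) -> str:
    """Reuse the gait learned on the most similar simulated terrain."""
    nearest = min(SIM_TERRAINS, key=lambda name: abs(SIM_TERRAINS[name] - obstacle_height))
    return max(ACTIONS, key=lambda a: values[nearest][a])


values = train_in_simulation()
fallen_log_gait = choose_action(values, obstacle_height=0.25)  # never seen in simulation
```

Here the fallen log, at an assumed 0.25 meters, is closest to the simulated cinder blocks, so the learner picks the high step it settled on after stumbling there. A real system would learn from far richer sensor data and would keep adjusting after deployment, which this sketch leaves out.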

By learning what doesn’t work, and more importantly by learning and repeating what does work, the AI that DARPA is working to develop will navigate a machine through familiar tasks within unfamiliar environments. While Shrobe speaks of infants, it’s also the kind of general adaptation to the world that we expect from adults, especially the young adults who enlist in the military. They are expected to perform the tasks they mastered in training in countries across the world, some of which they may not have even heard of before arriving.

While the fully capable robot technician Shrobe imagined is still a long way off, teams in DARPA’s Machine Common Sense program are working on developing and evaluating the component steps. This means not just repeating the text in a manual, or proving a robot can walk on uneven ground, but also testing to see if the AI can produce coherent next sentences in a language test, or if the robot can walk over uneven ground that suddenly becomes slick with oil.

One clean example of all this is training an AI in simulation to pass the same kind of tests given to children to see if they’ve developed that aforementioned idea of object permanence.

“You show an object rolling behind a screen, and then it never comes out. And you can now ask the AI system, can you find the object? And if it navigates in the simulated environment to go behind the screen, then you can assume it has object permanence, because it assumes the thing rolled behind the screen and stayed there, as opposed to it rolling behind the screen and stopping existing,” says Shrobe.
