Why researchers are teaching AI to play Minecraft

OpenAI created the most advanced Minecraft-playing AI to date, trained on more than 70,000 hours of gameplay.
An artist's approximation of a nuclear power plant model made in Minecraft. Planet Minecraft


Nuclear fusion and Minecraft may have more in common than the countless hours you can invest in them. As MIT Technology Review reported over the weekend, the artificial intelligence nonprofit OpenAI recently built the world’s most advanced Minecraft-playing bot by analyzing over 70,000 hours of human gameplay via a new training method. While the bot is currently relegated to crafting pixelated tools and buildings, researchers claim its achievements may one day help usher in breakthrough technologies like true self-driving vehicles and virtually unlimited renewable energy resources.

To design the first bot capable of constructing “diamond tools,” in-game Minecraft items that take humans, on average, about 20 minutes and 24,000 actions to craft, researchers used a technique known as imitation learning. As its name implies, imitation learning has an AI learn from thousands of examples of human input, then refine its behavior to achieve the intended outcomes. Reinforcement learning, another popular and effective AI design method, instead centers on an unfocused trial-and-error approach to its education.
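
In the simplest case, imitation learning reduces to supervised learning on recorded human demonstrations. The sketch below is an illustrative toy, not anything from OpenAI's system: it fits a tabular policy by majority vote over hypothetical (state, action) pairs, the bare-bones version of "watch humans, then copy them."

```python
from collections import Counter, defaultdict

def behavioral_cloning(demonstrations):
    """Fit a toy tabular policy by majority vote over demonstrated actions."""
    votes = defaultdict(Counter)
    for state, action in demonstrations:
        votes[state][action] += 1
    # the learned policy picks the most common human action for each state
    return {state: counter.most_common(1)[0][0] for state, counter in votes.items()}

# hypothetical demonstrations: (game situation, human action) pairs
demos = [
    ("tree_nearby", "chop_wood"),
    ("tree_nearby", "chop_wood"),
    ("tree_nearby", "walk"),
    ("have_planks", "craft_table"),
]
policy = behavioral_cloning(demos)
```

A reinforcement learner, by contrast, would start with no demonstrations at all and discover which actions earn reward through trial and error.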

[Related: This agile robot dog uses a video camera in place of senses.]

A major issue with imitation learning has been that it normally requires researchers to hand-label “each step,” explains Technology Review, i.e. “doing this action makes this happen, doing that action makes that happen, and so on.” OpenAI managed to sidestep this immensely time-consuming process by constructing an entirely separate neural network to handle the labeling procedure, part of what it dubs Video Pre-Training (VPT). Researchers first hired gig workers to play Minecraft, then recorded 2,000 hours of their keystrokes, mouse clicks, and video gameplay to use as reference for a subsequent AI bot’s training.
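
The structure of that pipeline can be sketched in miniature: a model trained on the small hand-labeled dataset learns to infer which action connects two consecutive frames, and that model then pseudo-labels the much larger pile of unlabeled footage. The code below is a stand-in for OpenAI's labeling network (a dictionary lookup in place of a neural network), with invented frame and action names; it only illustrates the data flow.

```python
def train_inverse_dynamics(labeled_clips):
    """Toy 'labeling model': map (frame_before, frame_after) -> inferred action.
    Stands in for the network OpenAI trained on 2,000 hours of contractor data."""
    lookup = dict(labeled_clips)
    def predict_action(frame_before, frame_after):
        return lookup.get((frame_before, frame_after), "no_op")
    return predict_action

def pseudo_label(video_frames, predict_action):
    """Turn raw, unlabeled footage into (frame pair, inferred action) training data."""
    return [
        ((before, after), predict_action(before, after))
        for before, after in zip(video_frames, video_frames[1:])
    ]

# hypothetical hand-labeled clips, then a raw video labeled automatically
labeled = [(("standing", "facing_tree"), "turn"), (("facing_tree", "tree_chopped"), "chop")]
model = train_inverse_dynamics(labeled)
labels = pseudo_label(["standing", "facing_tree", "tree_chopped"], model)
```

Once the footage is labeled this way, it can feed the same supervised imitation-learning setup as hand-labeled data, which is what makes the 70,000-hour corpus usable.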

With the addition of VPT, the new AI program could construct items in Minecraft previously unattainable for bots reliant only on reinforcement learning, such as the estimated 970-step process of building a table from crafted planks. When imitation and reinforcement learning were combined, the bot could handle construction projects involving over 20,000 consecutive actions.
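
The combination works because imitation gives the bot a sensible starting policy, and trial-and-error then refines it. The toy sketch below, with an invented one-state environment and reward function, shows that shape: an imitation-learned habit seeds the policy, and epsilon-greedy exploration with running reward averages nudges it toward a better action. None of this is OpenAI's actual training code.

```python
import random

def fine_tune(policy, reward_fn, states, actions, episodes=200, epsilon=0.2, seed=0):
    """Refine an imitation-learned tabular policy with trial-and-error updates."""
    rng = random.Random(seed)
    value = {}  # (state, action) -> running average of observed reward
    for _ in range(episodes):
        for state in states:
            # mostly follow the demonstrated behavior, occasionally explore
            action = rng.choice(actions) if rng.random() < epsilon else policy[state]
            r = reward_fn(state, action)
            key = (state, action)
            value[key] = value.get(key, 0.0) + 0.5 * (r - value.get(key, 0.0))
            # adopt whichever action currently scores best
            policy[state] = max(actions, key=lambda a: value.get((state, a), 0.0))
    return policy

# hypothetical reward: the demonstrated habit is decent, but a better action exists
def reward_fn(state, action):
    return {"craft_table": 1.0, "chop_wood": 0.5}.get(action, 0.0)

policy = {"have_planks": "chop_wood"}  # suboptimal habit copied from demonstrations
policy = fine_tune(policy, reward_fn, states=["have_planks"],
                   actions=["chop_wood", "craft_table", "walk"])
```

Starting from demonstrations rather than from scratch is what lets the combined approach string together action sequences far longer than trial and error alone could discover.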

[Related: An AI that lets cars communicate might reduce traffic jams.]

Although likely many years away, fields that have already benefited from reinforcement learning, such as nuclear fusion research and self-driving advancements, could gain additional support from the imitation learning progress first on display in video games like Minecraft. Until then, ethical questions abound over which data troves are used in methods like imitation and reinforcement learning, and how effectively they can be applied.

OpenAI was co-founded in 2015 by a team including Elon Musk and Sam Altman, and counted Peter Thiel as an initial investor. Musk stepped down from the board of directors in 2018.

We’ve reached out to OpenAI for clarification on where it gathered its 70,000 hours of Minecraft playthroughs, as well as whether the videos’ authors are aware of the usage, and will update accordingly.