DeepMind wants to teach robots to play board games

Mastering physical systems with abstract goals is an unsolved challenge in AI. To encourage the development of techniques that might overcome it, researchers at DeepMind created custom scenarios for the physics engine MuJoCo that task an AI agent with coordinating perception, reasoning, and motor control over time. They believe that the library, which they’ve made publicly available, can help bridge the gap between abstract planning and embodied control.

Recent work in machine learning has led to algorithms capable of mastering board games such as Go, chess, and shogi. These algorithms observe the states of games and control these states directly with their actions, unlike humans, who don’t just reason about the moves but look at the board and physically manipulate the game pieces with their fingers. Beyond games, many problems in the real world require a combination of perception, planning, and execution, which even leading algorithms mostly fail to capture.

The team’s solution is a set of challenges that embed tasks from games (e.g., tic-tac-toe, Sokoban) into environments where agents must control a physical body to execute moves. For example, to place a single tic-tac-toe piece, an agent has to reach the board with a 9-degree-of-freedom arm and touch the corresponding place on that board. Learning to play tic-tac-toe and executing a reaching movement are well within the capabilities of current AI approaches, but most agents struggle when they’re faced with both problems at once.

In MuJoBan, which is based on Sokoban, an agent situated on a grid has to push boxes onto target locations. Only one box can be pushed at a time and boxes can only be pushed, not pulled. MuJoXo is akin to tic-tac-toe, with randomness to ensure pieces aren’t aligned perfectly on the board. The last game, MuJoGo, is a 7-by-7 Go board designed to be solved in roughly 50 moves (2.5 seconds).
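The Sokoban push rule described above can be illustrated with a short sketch. This is not DeepMind's implementation: it models only the abstract grid game, not the MuJoCo physics, and the names `try_move` and `solved` and the set-of-coordinates board representation are assumptions made for illustration.

```python
from typing import FrozenSet, Tuple

Pos = Tuple[int, int]  # (row, column) on the grid


def try_move(agent: Pos, boxes: FrozenSet[Pos], walls: FrozenSet[Pos],
             direction: Tuple[int, int]) -> Tuple[Pos, FrozenSet[Pos]]:
    """Hypothetical helper: apply one step and return (agent, boxes).

    The state is returned unchanged when the move is illegal.
    """
    dr, dc = direction
    nxt = (agent[0] + dr, agent[1] + dc)
    if nxt in walls:
        return agent, boxes
    if nxt in boxes:
        beyond = (nxt[0] + dr, nxt[1] + dc)
        # Only one box can be pushed at a time, so the cell behind
        # the box must be free of walls and other boxes.
        if beyond in walls or beyond in boxes:
            return agent, boxes
        return nxt, (boxes - {nxt}) | {beyond}
    # Moving away from a box never drags it along: no pulling.
    return nxt, boxes


def solved(boxes: FrozenSet[Pos], targets: FrozenSet[Pos]) -> bool:
    """A level is solved when every box sits on a target location."""
    return boxes == targets
```

Because a misplaced push cannot be undone by pulling, a single bad move can make a level unsolvable, which is part of what makes the game a planning problem.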

In experiments, the researchers designed example agents to complete various game tasks. The agents used a planner module that maps ground-truth game states to target states and plots out the actions needed to reach them. The researchers also added an auxiliary task to encourage agents to follow instructions: an agent received a reward when it executed actions that resulted in the game moves suggested by the instructions. (A “reward” refers to positive feedback that reinforces desirable behaviors, or game moves, as the case may be.)
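As a rough illustration of the instruction-following reward, the sketch below pays a bonus only when the change in the abstract board state matches the instructed move. It uses tic-tac-toe for simplicity. The function names, the tuple board encoding, and the bonus value are all assumptions for illustration, not details taken from the paper.

```python
from typing import Optional, Tuple

# Hypothetical encoding: 9 cells, each None, "X", or "O".
Board = Tuple[Optional[str], ...]


def apply_move(board: Board, cell: int, mark: str) -> Board:
    """Return the board that results from placing `mark` on an empty cell."""
    if board[cell] is not None:
        raise ValueError("cell is already occupied")
    return board[:cell] + (mark,) + board[cell + 1:]


def instruction_reward(before: Board, after: Board, cell: int, mark: str,
                       bonus: float = 1.0) -> float:
    """Pay `bonus` if the observed board change equals the instructed move."""
    if before[cell] is not None:
        return 0.0  # the instruction itself was not a legal move
    return bonus if after == apply_move(before, cell, mark) else 0.0


empty: Board = (None,) * 9
followed = apply_move(empty, 4, "X")   # agent placed X in the center
deviated = apply_move(empty, 0, "X")   # agent placed X in a corner instead
print(instruction_reward(empty, followed, cell=4, mark="X"))
print(instruction_reward(empty, deviated, cell=4, mark="X"))
```

In the physical versions of the games, the "after" state would come from reading the board once the arm has finished moving, so the agent is rewarded for the outcome of its motor actions rather than for any particular trajectory.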

The researchers report that the agents were unable to solve more than half of the levels in MuJoBan after extensive training, which they blame on a combination of multistep reasoning and control challenges. The simplest agent required around a million games before it could play MuJoXo “convincingly,” and it didn’t show any sign of progress in MuJoGo even after billions of steps of training.

“Problems that require reasoning and decision making over long time scales using sensorimotor control cannot yet be solved in an end-to-end fashion. These problems arise frequently in human behavior but are still hard to frame and rarely studied in a controlled experimental setting,” the researchers wrote in a paper describing the work. “We hope that the environments provided here will spur research into how to coherently introduce these capacities into the next generation of AI agents.”

All three scenarios are available on GitHub.
