Finally, A Computer Beat a Human Champion in the Game of Go

For more than 40 years, programmers have been trying to train their computers on this deceptively complex game.

by Jessie Guy-Ryan March 13, 2016

A Go board and pieces. (Photo: Liz West/CC BY-2.0).

If you aren’t familiar with Go, it may seem like a simple game. There’s only one kind of piece (a round stone, either white or black), and the only move is placing a stone on the board, with the straightforward aim of controlling the most territory. In fact, the game is incredibly complex, with the number of legal board positions estimated at about 10^170. In the 1979 novel Shibumi, the author Trevanian wrote, “Go is to Western chess what philosophy is to double entry accounting.”
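
For a sense of where numbers of this size come from, here is a rough back-of-the-envelope count in Python. It assumes only that each of the 361 intersections on a 19x19 board can be empty, black, or white, which gives an upper bound of 3^361 arrangements. Many of those arrangements are not legal positions, so the true figure is smaller. The variable names are illustrative.

```python
import math

# Each of the 19 x 19 = 361 intersections is empty, black, or white.
POINTS = 19 * 19

# Upper bound on board arrangements (legal or not).
upper_bound = 3 ** POINTS

# Python integers have arbitrary precision, so this is exact.
print("digits in 3^361:", len(str(upper_bound)))
print("log10 of 3^361:", round(math.log10(upper_bound), 2))
```

The bound comes out at roughly 10^172, which is the same ballpark as the estimate quoted above.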

That complexity is why programmers have spent more than 40 years trying to teach a computer to play Go. This weekend, the Google DeepMind team made a breakthrough when their AlphaGo program won a five-game match against champion Go player Lee Se-dol.

It’s an astounding milestone in the long history of computer Go programming, with huge implications for future research.

Artificial intelligence programmers have wrestled with the challenge of designing a Go computer program since the 1960s. Go presents numerous challenges: the 19x19 board is very large, there’s an incalculable number of possible moves to evaluate, and evaluation itself is tricky—many moves may be equally good choices or may have broader significance further along in the game. A February article on DeepMind’s Demis Hassabis in The Guardian summarizes why teaching a computer to evaluate potential moves is so hard:

Its branching factor is huge: it has more possible moves than there are atoms in the universe; and, unlike chess, it can’t be figured out by brute calculation. Intractable, it is also impossible to write an evaluation function, ie a set of rules that tell you who is winning a position and by how much. Instead it demands something akin to “intuition” from its players: when asked why they made a certain move, professionals often say something along the lines of: “It felt right.”

The level of difficulty hasn’t kept computer scientists from trying, even before computers achieved the power and complexity they enjoy today. Albert Zobrist put forth one of the earliest attempts at developing a method for computer Go in 1970. In 1976, David Benson published Benson’s algorithm, which provided a way for programs to determine which pieces were “unconditionally alive” (that is, completely safe from capture by the opponent), and a 1981 issue of Byte magazine detailed a Go program named Wally that could play at a beginner level. But even with these incremental advances, the Go programs still had a basic problem: they couldn’t perform the complex evaluations needed to play like an advanced player in a reasonable timeframe.

As computer chess programs grew more and more skilled, with Deep Blue beating the highest-ranked player in history, Garry Kasparov, in 1997, Go programs appeared stuck. In a 2007 article for IEEE Spectrum, researcher Feng-Hsiung Hsu lamented, “Ten years later, the best Go programs still can’t beat good human players.”

It appears the key to improving Go programs, and to replicating that elusive concept of “intuiting” the best possible move, lies in the newer AI techniques of neural networks and deep learning. As the AlphaGo team explained in Nature, the program uses two kinds of neural networks to evaluate the board: “value networks” analyze board positions while “policy networks” select moves. To train the networks, AlphaGo uses deep learning, ingesting data from actual human-played Go games as well as simulating thousands of games against a modified version of itself. The game data teaches AlphaGo’s “policy network” how to select a few potential moves for each turn, and the “value network” evaluates the positions that result as the program looks a few moves ahead, estimating which move is most likely to lead to a win.

On its face, it sounds a lot like how a human might think through selecting a move in a game of Go.
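
The division of labor described above, in which one component proposes a few moves and another judges the resulting positions, can be sketched in a few lines of Python. This is a toy illustration, not DeepMind’s code. The functions `toy_policy` and `toy_value` are hypothetical stand-ins that use simple hand-written heuristics where AlphaGo uses trained neural networks, and the sketch looks only one move ahead rather than searching deeper.

```python
BOARD_SIZE = 19
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def toy_policy(board, k=3):
    """Hypothetical stand-in for a policy network: propose k candidate moves.
    Here it simply prefers empty points closest to the centre of the board."""
    centre = (BOARD_SIZE - 1) / 2
    empty = [(r, c) for r in range(BOARD_SIZE) for c in range(BOARD_SIZE)
             if (r, c) not in board]
    empty.sort(key=lambda p: abs(p[0] - centre) + abs(p[1] - centre))
    return empty[:k]

def liberties(board, player):
    """Count the distinct empty points adjacent to any of the player's stones."""
    libs = set()
    for (r, c), colour in board.items():
        if colour != player:
            continue
        for dr, dc in NEIGHBOURS:
            p = (r + dr, c + dc)
            if 0 <= p[0] < BOARD_SIZE and 0 <= p[1] < BOARD_SIZE and p not in board:
                libs.add(p)
    return len(libs)

def toy_value(board, player):
    """Hypothetical stand-in for a value network: score a position in [0, 1].
    Here the score is the player's share of all liberties on the board."""
    opponent = "white" if player == "black" else "black"
    mine, theirs = liberties(board, player), liberties(board, opponent)
    total = mine + theirs
    return 0.5 if total == 0 else mine / total

def choose_move(board, player):
    """Ask the policy for a few candidates, score each resulting position
    with the value function, and return the best-scoring move."""
    best_move, best_score = None, -1.0
    for move in toy_policy(board):
        next_board = dict(board)      # copy so the real board is untouched
        next_board[move] = player     # captures are ignored in this sketch
        score = toy_value(next_board, player)
        if score > best_score:
            best_move, best_score = move, score
    return best_move

# The board is a dict mapping (row, col) to "black" or "white".
print(choose_move({}, "black"))
print(choose_move({(9, 9): "white"}, "black"))
```

The point of the split is that the policy step narrows hundreds of legal moves down to a handful, so the more expensive evaluation step only has to judge a few positions.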

While AlphaGo has achieved a never-before-seen level of play, one previously thought to be at least a decade away, it’s by no means perfect. Last night, Lee Se-dol managed to beat AlphaGo after the program had won three games in a row. According to Demis Hassabis’ live-tweeting of the game, AlphaGo made a mistake on move 79 but didn’t “realize” it until move 87, when, Hassabis explains, the value network’s output plunged. Again, it’s a disarmingly human-like situation for the program to find itself in.

Lee Sedol is playing brilliantly! #AlphaGo thought it was doing well, but got confused on move 87. We are in trouble now…
— Demis Hassabis (@demishassabis) March 13, 2016

Mistakes aside, the importance of AlphaGo’s win is hard to overstate. It marks the first time a Go program has defeated an 18-time world champion, and AlphaGo’s deep learning techniques have huge implications for machine learning and AI development. The Verge explains the larger impact:

DeepMind, however, believes that the principles it uses in AlphaGo have broader applications than just Go. Hassabis makes a distinction between “narrow” AIs like Deep Blue and artificial “general” intelligence (AGI), the latter being more flexible and adaptive. Ultimately the Google unit thinks its machine learning techniques will be useful in robotics, smartphone assistant systems, and healthcare; last month DeepMind announced that it had struck a deal with the UK’s National Health Service.

Although AlphaGo represents great potential for machine learning, Hassabis tells The Economist that general machine intelligence is probably still a long way off. But this weekend’s event shows us how games can be used to measure our progress in building artificial intelligence, and as DeepMind moves on to card games, Vegas can’t be far away.