Computers are better than humans at a lot of things, but there are numerous problem spaces where they struggle. Anything with complex branching or a large number of possibilities forces them into costly jumps, negating the benefits of their raw speed. This is why there was such a long gap between computers beating humans at something computationally simple like tic-tac-toe and beating them at chess. However, one game has proven elusive to even the most cutting-edge AI developers: the seemingly simple game of Go. Unlike chess engines, which often rely on brute-forcing through many possible moves and calculating the best one, a Go engine faces an incomprehensibly large number of possibilities (Go's branching factor is roughly 250 moves per turn, versus around 35 for chess), making such an approach near impossible. Yet Google DeepMind's AlphaGo has successfully defeated the top European player and will soon face its toughest challenge yet.
Unlike previous game-playing AIs, which often relied on calculating board scores for potential moves, AlphaGo is a neural network that's undergone what's called supervised learning. Essentially, DeepMind took professional-level Go games and fed their moves into a neural network, telling it which outcomes led to success and which didn't, allowing the network to develop its own pattern recognition for winning moves. This isn't what let them beat a top Go player, however, as supervised learning is a well-established technique in the development of neural networks. Their secret sauce appears to be a combination of an algorithm called Monte Carlo Tree Search (MCTS) and the fact that they pitted the AI against itself in order for it to improve.
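To make the supervised learning step concrete, here's a minimal sketch of the idea in Python. It assumes a toy 3x3 board and a plain linear softmax policy rather than the deep convolutional network DeepMind actually used, and the "expert" data is randomly generated stand-in rather than real professional games.

```python
# A toy illustration of supervised learning for a move-prediction policy,
# assuming a tiny 3x3 board (the real game uses 19x19) and a linear
# softmax model. All names here are hypothetical, not AlphaGo's code.
import numpy as np

BOARD_CELLS = 9          # 3x3 toy board; real Go has 361 points
rng = np.random.default_rng(0)

# Fake "expert" data: board states (flattened, -1/0/+1 per point) and
# the move the expert actually played (an index into the board).
states = rng.integers(-1, 2, size=(500, BOARD_CELLS)).astype(float)
expert_moves = rng.integers(0, BOARD_CELLS, size=500)

# Linear policy: logits = states @ W, P(move) = softmax(logits)
W = np.zeros((BOARD_CELLS, BOARD_CELLS))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train by gradient descent on cross-entropy: nudge the policy toward
# assigning higher probability to the moves the experts chose.
for epoch in range(200):
    probs = softmax(states @ W)
    grad = probs.copy()
    grad[np.arange(len(expert_moves)), expert_moves] -= 1.0
    W -= 0.05 * (states.T @ grad) / len(states)

# After training, the policy outputs a probability for each candidate move.
print(softmax(states[:1] @ W).round(3))
```

The key point is the update rule: the model's probability for the move the expert actually played gets pushed up, and everything else gets pushed down, which is all "learning from professional games" really means at the mechanical level.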
MCTS is a very interesting idea, one that's broadly applicable to games with a finite set of moves or set limits on play. Essentially, what an MCTS will do is select moves at random and play them out until the game is finished. The moves made along the way are then used to adjust the weightings of how promising each candidate move is, based on the outcome. In essence, this lets you home in on the most promising moves without having to exhaustively search the entire game tree. The tradeoff, of course, is between how long and deep you let the search run and how much time you have to actually make a move.
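Here's a hedged, minimal illustration of that random-playout idea. To keep it self-contained it plays a toy game of Nim (take one to three stones, whoever takes the last stone wins) rather than Go, and it uses flat random playouts per candidate move rather than the full tree search with neural-network guidance that AlphaGo employs.

```python
# Flat Monte Carlo move selection on a toy game of Nim: take 1-3 stones,
# whoever takes the last stone wins. A simplified stand-in for Go; the
# function names here are illustrative, not AlphaGo's.
import random

def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def simulate(stones, our_turn):
    # Play random moves until the pile is empty; return True if the
    # side we're tracking took the last stone (i.e. we won).
    while True:
        stones -= random.choice(legal_moves(stones))
        if stones == 0:
            return our_turn
        our_turn = not our_turn

def best_move(stones, playouts=2000):
    # For each legal move, run many random playouts and keep a running
    # win rate; that win rate is the "weighting" that steers the search.
    # More playouts means better estimates but more time per decision,
    # which is exactly the tradeoff described above.
    win_rates = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            win_rates[move] = 1.0  # taking the last stone wins outright
        else:
            wins = sum(simulate(remaining, our_turn=False)
                       for _ in range(playouts))
            win_rates[move] = wins / playouts
    return max(win_rates, key=win_rates.get), win_rates

move, rates = best_move(10)
print(f"best move: take {move}, win rates: {rates}")
```

AlphaGo's real search guides these playouts with its neural networks rather than picking uniformly at random, but the skeleton is the same: sample, play out, score, reweight.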
This is where the millions of games AlphaGo played against itself come into play, as they allowed both the neural networks and the MCTS algorithm to be greatly refined. In their single-machine tests it lost only once to other Go programs out of almost 500 games. In the match against Fan Hui, however, he faced a veritable army of hardware: some 170 GPUs and 1200 CPUs. That should give you some indication of just how complex Go is and what it's taken to get to this point.
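The self-play loop itself can be sketched in a few lines. The version below, which reuses the toy Nim game from the previous snippet, plays a simple weighted-random policy against itself and nudges the weights of the winner's moves up and the loser's down, a crude REINFORCE-style update and not DeepMind's actual training scheme.

```python
# Crude self-play on the toy Nim game: a weighted-random policy plays
# itself, and move preferences are reinforced or dampened based on who
# won. Illustrative only; AlphaGo's self-play trained deep networks
# with far more sophisticated updates.
import random
from collections import defaultdict

MAX_PILE = 20
weights = defaultdict(lambda: 1.0)   # (stones_left, move) -> preference

def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def pick_move(stones):
    moves = legal_moves(stones)
    prefs = [weights[(stones, m)] for m in moves]
    return random.choices(moves, weights=prefs)[0]

for _ in range(5000):
    stones, player = MAX_PILE, 0
    history = {0: [], 1: []}
    while stones > 0:
        move = pick_move(stones)
        history[player].append((stones, move))
        stones -= move
        if stones == 0:
            winner = player
        player = 1 - player
    # Reinforce the winner's moves, dampen the loser's.
    for key in history[winner]:
        weights[key] *= 1.05
    for key in history[1 - winner]:
        weights[key] *= 0.97

# With ten stones left the known winning move is to take two, leaving a
# multiple of four; after self-play the policy should tend toward it.
print({m: round(weights[(10, m)], 2) for m in (1, 2, 3)})
```

Scale that loop up to deep networks and millions of games and you get a sense of why a datacenter's worth of GPUs and CPUs was involved.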
AlphaGo's biggest challenge is still ahead of it, though, as it prepares to face the top Go player of the past decade, Lee Sedol. As an opponent, Lee is in another class entirely: a 9th dan professional to Fan's 2nd dan. How they structure the matches, and the infrastructure they bring to support AlphaGo, will be incredibly interesting, but whether or not it will come out victorious is anyone's guess.