ALPHAGO DESIGN. “These [AlphaGo's neural networks] were trained to mimic the moves of the best human players, to reward wins and, using a probability distribution, to limit the outcomes for any board position to a single verdict: win or lose.”
http://www.nature.com/news/digital-intuition-1.19230
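
The two ideas in the quote, a probability distribution over candidate moves and a single win-or-lose verdict for a position, can be illustrated with a toy sketch. This is not AlphaGo's implementation: the function names (softmax, win_probability, verdict), the 0.5 threshold and the example scores are all hypothetical, and hand-picked numbers stand in for the output of a trained network.

```python
import math

def softmax(logits):
    """Turn raw move scores into a probability distribution over moves."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def win_probability(score):
    """Squash a raw position score into an estimated chance of winning."""
    return 1.0 / (1.0 + math.exp(-score))

def verdict(score, threshold=0.5):
    """Collapse the estimated win probability to one verdict (assumed threshold)."""
    return "win" if win_probability(score) >= threshold else "lose"

# Hypothetical scores for three candidate moves; a real system would
# obtain them from a network trained on expert games.
move_logits = [2.0, 0.5, -1.0]
move_probs = softmax(move_logits)
best_move = max(range(len(move_probs)), key=move_probs.__getitem__)
```

Here move_probs plays the part of the move-selection distribution, while verdict reduces a position's score to the binary outcome the quote describes.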