Three-Head Neural Network Architecture for Monte Carlo Tree Search
Abstract
AlphaGo Zero pioneered the use of two-head neural networks in Monte Carlo Tree Search (MCTS), where the policy output provides prior action probabilities and the state-value estimate is used to evaluate leaf nodes. We propose a three-head neural network architecture with policy, state-value, and action-value outputs, which can lead to more efficient MCTS: the neural leaf estimate can still be backed up through the tree while node expansion and evaluation are delayed. To effectively train the newly introduced action-value head on the same game dataset used for two-head nets, we exploit the optimal relations between parent and child nodes for data augmentation and regularization. In our experiments on the game of Hex, the action-value head achieves a learning error similar to that of the state-value prediction in a two-head architecture. The resulting neural network models are then combined with the same Policy-Value MCTS (PV-MCTS) implementation. We show that, due to more efficient use of neural network evaluations, PV-MCTS with three-head networks consistently outperforms the two-head version, significantly outplaying the state-of-the-art player MoHex-CNN.
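For a concrete picture of the architecture the abstract describes, below is a minimal PyTorch sketch of a network with a shared trunk and three heads: policy, state value, and action value. The layer shapes, channel counts, and 13x13 board size are illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch of a three-head network: a shared convolutional trunk
# with policy (p), state-value (v), and action-value (q) heads.
# All hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

class ThreeHeadNet(nn.Module):
    def __init__(self, board_size=13, in_channels=4, filters=64):
        super().__init__()
        n = board_size * board_size
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, filters, 3, padding=1), nn.ReLU(),
            nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU(),
        )
        flat = filters * n
        # Policy head: prior move probabilities (logits over all cells).
        self.policy = nn.Linear(flat, n)
        # State-value head: scalar evaluation of the current position.
        self.state_value = nn.Sequential(nn.Linear(flat, 1), nn.Tanh())
        # Action-value head: one value per move, so a leaf estimate is
        # available for every child without expanding and re-evaluating it.
        self.action_value = nn.Sequential(nn.Linear(flat, n), nn.Tanh())

    def forward(self, x):
        h = self.trunk(x).flatten(1)
        return self.policy(h), self.state_value(h), self.action_value(h)
```

The data-augmentation and regularization idea rests on consistency relations between a position and its children: under the negamax convention, v(s) = max_a q(s, a) and q(s, a) = -v(s'), where s' is the position reached from s by move a, so state-value labels can constrain the new action-value head. (The exact form of the relations used in the paper may differ; this is the general game-theoretic identity they build on.)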
Cite
Text
Gao et al. "Three-Head Neural Network Architecture for Monte Carlo Tree Search." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/523Markdown
[Gao et al. "Three-Head Neural Network Architecture for Monte Carlo Tree Search." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/gao2018ijcai-three/) doi:10.24963/IJCAI.2018/523
BibTeX
@inproceedings{gao2018ijcai-three,
title = {{Three-Head Neural Network Architecture for Monte Carlo Tree Search}},
author = {Gao, Chao and Müller, Martin and Hayward, Ryan},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2018},
pages = {3762--3768},
doi = {10.24963/IJCAI.2018/523},
url = {https://mlanthology.org/ijcai/2018/gao2018ijcai-three/}
}