Monte-Carlo Tree Search Using Batch Value of Perfect Information

Abstract

This paper focuses on the selection phase of Monte-Carlo Tree Search (MCTS). We define batch value of perfect information (BVPI) in game trees as a generalization of value of computation as proposed by Russell and Wefald, and use it for selecting nodes to sample in MCTS. We show that computing the BVPI is NP-hard, but it can be approximated in polynomial time. In addition, we pro- pose methods that intelligently find sets of fringe nodes with high BVPI, and quickly select nodes to sample from these sets. We apply our new BVPI methods to partial game trees, both in a stand-alone set of tests, and as a component of a full MCTS algorithm. Empirical results show that our BVPI methods outperform existing node- selection methods for MCTS in different scenarios.

Cite

Text

Shperberg et al. "Monte-Carlo Tree Search Using Batch Value of Perfect Information." Conference on Uncertainty in Artificial Intelligence, 2017.

Markdown

[Shperberg et al. "Monte-Carlo Tree Search Using Batch Value of Perfect Information." Conference on Uncertainty in Artificial Intelligence, 2017.](https://mlanthology.org/uai/2017/shperberg2017uai-monte/)

BibTeX

@inproceedings{shperberg2017uai-monte,
  title     = {{Monte-Carlo Tree Search Using Batch Value of Perfect Information}},
  author    = {Shperberg, Shahaf S. and Shimony, Solomon Eyal and Felner, Ariel},
  booktitle = {Conference on Uncertainty in Artificial Intelligence},
  year      = {2017},
  url       = {https://mlanthology.org/uai/2017/shperberg2017uai-monte/}
}