Smooth UCT Search in Computer Poker
Abstract
Self-play Monte Carlo Tree Search (MCTS) has been successful in many perfect-information two-player games. Although these methods have been extended to imperfect-information games, so far they have not achieved the same level of practical success or theoretical convergence guarantees as competing methods. In this paper we introduce Smooth UCT, a variant of the established Upper Confidence Bounds Applied to Trees (UCT) algorithm. Smooth UCT agents mix in their average policy during self-play and the resulting planning process resembles game-theoretic fictitious play. When applied to Kuhn and Leduc poker, Smooth UCT approached a Nash equilibrium, whereas UCT diverged. In addition, Smooth UCT outperformed UCT in Limit Texas Hold'em and won 3 silver medals in the 2014 Annual Computer Poker Competition.
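The mixing idea described in the abstract can be illustrated with a minimal sketch of action selection at a single tree node. This is not the authors' implementation: the function names, the constant mixing probability `eta`, and the exploration constant `c` are assumptions introduced here for illustration, and the abstract does not specify how the mixing probability is set. The average policy is taken to be the normalised visit counts of the node.

```python
import math
import random


def ucb_action(counts, values, c=1.0):
    """Standard UCT choice: value estimate plus an exploration bonus."""
    # Try every untried action once before applying the bonus formula.
    for action, n in enumerate(counts):
        if n == 0:
            return action
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda a: values[a] + c * math.sqrt(math.log(total) / counts[a]),
    )


def average_policy(counts):
    """Average policy at a node, taken here as normalised visit counts."""
    total = sum(counts)
    if total == 0:
        # No visits yet: fall back to a uniform policy.
        return [1.0 / len(counts)] * len(counts)
    return [n / total for n in counts]


def smooth_uct_action(counts, values, eta=0.5, c=1.0, rng=random):
    """Mix the UCT choice with the average policy.

    With probability eta (a hypothetical constant in this sketch) the UCT
    action is played; otherwise an action is sampled from the average policy.
    """
    if rng.random() < eta:
        return ucb_action(counts, values, c)
    probs = average_policy(counts)
    return rng.choices(range(len(counts)), weights=probs, k=1)[0]
```

Setting `eta=1.0` in this sketch recovers plain UCT selection, while `eta=0.0` always samples from the average policy, which is the component that makes the process resemble fictitious play.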
Cite
Text
Heinrich and Silver. "Smooth UCT Search in Computer Poker." International Joint Conference on Artificial Intelligence, 2015.
Markdown
[Heinrich and Silver. "Smooth UCT Search in Computer Poker." International Joint Conference on Artificial Intelligence, 2015.](https://mlanthology.org/ijcai/2015/heinrich2015ijcai-smooth/)
BibTeX
@inproceedings{heinrich2015ijcai-smooth,
title = {{Smooth UCT Search in Computer Poker}},
author = {Heinrich, Johannes and Silver, David},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2015},
pages = {554--560},
url = {https://mlanthology.org/ijcai/2015/heinrich2015ijcai-smooth/}
}