On Strength Adjustment for MCTS-Based Programs
Abstract
This paper proposes an approach to strength adjustment for MCTS-based game-playing programs. In this approach, we use a softmax policy with a strength index z to choose moves. Most importantly, we filter out low-quality moves by excluding those whose simulation counts fall below a pre-defined threshold ratio of the maximum simulation count. We perform a theoretical analysis, showing that with such a threshold ratio the adjusted policy is guaranteed to choose moves whose strength exceeds a lower bound. The approach is applied to the Go program ELF OpenGo. The experiment results show that z is highly correlated with empirical strength; namely, given a threshold ratio of 0.1, z is linearly related to the Elo rating with a regression error of 47.95 Elo for −2 ≤ z ≤ 2. Meanwhile, the covered strength range is about 800 Elo over the interval z ∈ [−2, 2]. With the ease of strength adjustment using z, we present two methods to adjust strength and predict opponents' strengths dynamically. To our knowledge, this result is state-of-the-art in terms of the range of strengths in Elo rating while maintaining a controllable relationship between the strength and a strength index.
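The move-selection scheme described in the abstract can be sketched in a few lines: filter out moves whose simulation count is below a threshold ratio of the maximum count, then sample among the survivors with a weighting controlled by the strength index z. The `count ** z` weighting and the function name below are illustrative assumptions, not the paper's exact formulation; see the paper for the precise softmax policy.

```python
import random

def softmax_policy(counts, z, threshold_ratio=0.1):
    """Sample a move index from MCTS simulation counts.

    Moves with fewer than threshold_ratio * max(counts) simulations
    are filtered out (the paper's low-quality-move filter). The
    remaining moves are sampled with probability proportional to
    count ** z -- an assumed form of the softmax weighting for
    illustration only.
    """
    n_max = max(counts)
    # Keep only moves at or above the simulation-count threshold.
    candidates = [i for i, n in enumerate(counts)
                  if n >= threshold_ratio * n_max]
    weights = [counts[i] ** z for i in candidates]
    return random.choices(candidates, weights=weights)[0]
```

Intuitively, a large positive z concentrates probability on the most-visited (strongest) surviving move, z = 0 samples uniformly over the survivors, and negative z favors the weaker surviving moves, while the threshold ratio guarantees no move below the filter is ever played.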
Cite
Text
Wu et al. "On Strength Adjustment for MCTS-Based Programs." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33011222
Markdown
[Wu et al. "On Strength Adjustment for MCTS-Based Programs." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/wu2019aaai-strength/) doi:10.1609/AAAI.V33I01.33011222
BibTeX
@inproceedings{wu2019aaai-strength,
title = {{On Strength Adjustment for MCTS-Based Programs}},
author = {Wu, I-Chen and Wu, Ti-Rong and Liu, An-Jen and Guei, Hung and Wei, Tinghan},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2019},
pages = {1222-1229},
doi = {10.1609/AAAI.V33I01.33011222},
url = {https://mlanthology.org/aaai/2019/wu2019aaai-strength/}
}