Minimax Sample Complexity for Turn-Based Stochastic Game
Abstract
The empirical success of multi-agent reinforcement learning is encouraging, yet few theoretical guarantees have been established. In this work, we prove that the plug-in solver approach, arguably the most natural reinforcement learning algorithm, achieves minimax sample complexity for turn-based stochastic games (TBSGs). Specifically, we perform planning in an empirical TBSG constructed with a 'simulator' that allows sampling from arbitrary state-action pairs. We show that the Nash equilibrium strategy of the empirical TBSG is an approximate Nash equilibrium strategy in the true TBSG, and we give both problem-dependent and problem-independent bounds. We develop reward perturbation techniques to tackle the non-stationarity in the game and a Taylor-expansion-type analysis to improve the dependence on the approximation error. With these novel techniques, we prove the minimax sample complexity of turn-based stochastic games.
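The abstract describes the plug-in solver approach: draw samples from a generative model ('simulator') for every state-action pair, build an empirical TBSG from those samples, and return a Nash equilibrium strategy of the empirical game. The following is a minimal sketch of that pipeline, not the authors' implementation; the `simulator`, `owner`, and parameter names are illustrative assumptions, and the empirical game is solved here by simple minimax value iteration.

```python
import numpy as np

def plug_in_solver_tbsg(simulator, num_states, num_actions, owner,
                        gamma=0.99, samples_per_pair=1000, num_iters=2000):
    """Hypothetical sketch of the plug-in solver for a discounted TBSG.

    Assumptions (not from the paper's code):
      - simulator(s, a) returns a sampled (next_state, reward) pair,
      - owner[s] is 0 if the max-player controls state s, 1 for the min-player.
    """
    S, A = num_states, num_actions

    # Step 1: build the empirical model from generative-model samples.
    P_hat = np.zeros((S, A, S))
    R_hat = np.zeros((S, A))
    for s in range(S):
        for a in range(A):
            for _ in range(samples_per_pair):
                s_next, r = simulator(s, a)
                P_hat[s, a, s_next] += 1.0
                R_hat[s, a] += r
            P_hat[s, a] /= samples_per_pair
            R_hat[s, a] /= samples_per_pair

    # Step 2: solve the empirical TBSG by minimax value iteration
    # (max-player maximizes, min-player minimizes at the states they own).
    is_max_player = np.array([owner[s] == 0 for s in range(S)])
    V = np.zeros(S)
    for _ in range(num_iters):
        Q = R_hat + gamma * (P_hat @ V)          # shape (S, A)
        V = np.where(is_max_player, Q.max(axis=1), Q.min(axis=1))

    # Greedy (equilibrium) strategy of the empirical game.
    Q = R_hat + gamma * (P_hat @ V)
    pi = np.where(is_max_player, Q.argmax(axis=1), Q.argmin(axis=1))
    return pi, V
```

The paper's guarantee concerns exactly this kind of output: the strategy `pi` computed in the empirical game is an approximate Nash equilibrium of the true game, with the required `samples_per_pair` matching the minimax rate.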
Cite
Text
Cui and Yang. "Minimax Sample Complexity for Turn-Based Stochastic Game." Uncertainty in Artificial Intelligence, 2021.
Markdown
[Cui and Yang. "Minimax Sample Complexity for Turn-Based Stochastic Game." Uncertainty in Artificial Intelligence, 2021.](https://mlanthology.org/uai/2021/cui2021uai-minimax/)
BibTeX
@inproceedings{cui2021uai-minimax,
title = {{Minimax Sample Complexity for Turn-Based Stochastic Game}},
author = {Cui, Qiwen and Yang, Lin F.},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2021},
pages = {1496--1504},
volume = {161},
url = {https://mlanthology.org/uai/2021/cui2021uai-minimax/}
}