Minimax Sample Complexity for Turn-Based Stochastic Game

Abstract

The empirical success of multi-agent reinforcement learning is encouraging, yet few theoretical guarantees have been established. In this work, we prove that the plug-in solver approach, probably the most natural reinforcement learning algorithm, achieves minimax sample complexity for turn-based stochastic games (TBSGs). Specifically, we perform planning in an empirical TBSG constructed with a 'simulator' that allows sampling from an arbitrary state-action pair. We show that the empirical Nash equilibrium strategy is an approximate Nash equilibrium strategy in the true TBSG and give both problem-dependent and problem-independent bounds. We develop reward perturbation techniques to tackle the non-stationarity in the game and a Taylor-expansion-type analysis to improve the dependence on the approximation error. With these novel techniques, we prove the minimax sample complexity of turn-based stochastic games.
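The abstract's plug-in solver approach can be illustrated with a minimal sketch: draw a fixed number of samples per state-action pair from the simulator, build the empirical transition model, and solve the empirical TBSG by value iteration in which the max-player maximizes and the min-player minimizes the same value function. The code below is an assumption-laden illustration, not the paper's exact algorithm; in particular, the function and parameter names (`sample_next_state`, `plugin_solver_tbsg`, `n_samples`) are hypothetical, and the paper's reward perturbation technique is not included.

import numpy as np

def plugin_solver_tbsg(sample_next_state, n_states, n_actions, max_player_states,
                       gamma=0.95, n_samples=100, n_iters=1000):
    """Sketch of the plug-in (model-based) approach for a turn-based stochastic game.

    sample_next_state(s, a) -> (s_next, reward): one call to the generative model.
    max_player_states: boolean array; True where the max-player controls the state.
    """
    # 1. Build the empirical model from n_samples simulator calls per (s, a).
    P_hat = np.zeros((n_states, n_actions, n_states))
    R_hat = np.zeros((n_states, n_actions))
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(n_samples):
                s_next, r = sample_next_state(s, a)
                P_hat[s, a, s_next] += 1.0
                R_hat[s, a] += r
            P_hat[s, a] /= n_samples
            R_hat[s, a] /= n_samples

    # 2. Solve the empirical TBSG by value iteration: the max-player maximizes
    #    and the min-player minimizes the shared value function.
    V = np.zeros(n_states)
    for _ in range(n_iters):
        Q = R_hat + gamma * P_hat @ V          # shape (n_states, n_actions)
        V = np.where(max_player_states, Q.max(axis=1), Q.min(axis=1))

    # 3. Read off the empirical Nash equilibrium strategy (greedy in the empirical game).
    Q = R_hat + gamma * P_hat @ V
    policy = np.where(max_player_states, Q.argmax(axis=1), Q.argmin(axis=1))
    return policy, V

The paper's result is that, with enough samples per state-action pair, the greedy strategy computed in this empirical game is an approximate Nash equilibrium of the true game, and that the required number of samples matches the minimax lower bound.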

Cite

Text

Cui and Yang. "Minimax Sample Complexity for Turn-Based Stochastic Game." Uncertainty in Artificial Intelligence, 2021.

Markdown

[Cui and Yang. "Minimax Sample Complexity for Turn-Based Stochastic Game." Uncertainty in Artificial Intelligence, 2021.](https://mlanthology.org/uai/2021/cui2021uai-minimax/)

BibTeX

@inproceedings{cui2021uai-minimax,
  title     = {{Minimax Sample Complexity for Turn-Based Stochastic Game}},
  author    = {Cui, Qiwen and Yang, Lin F.},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2021},
  pages     = {1496--1504},
  volume    = {161},
  url       = {https://mlanthology.org/uai/2021/cui2021uai-minimax/}
}