Learning in Nonzero-Sum Stochastic Games with Potentials

Abstract

Multi-agent reinforcement learning (MARL) has become effective in tackling discrete cooperative game scenarios. However, MARL has yet to penetrate settings beyond those modelled by team and zero-sum games, confining it to a small subset of multi-agent systems. In this paper, we introduce a new generation of MARL learners that can handle *nonzero-sum* payoff structures and continuous settings. In particular, we study the MARL problem in a class of games known as stochastic potential games (SPGs) with continuous state-action spaces. Unlike cooperative games, in which all agents share a common reward, SPGs are capable of modelling real-world scenarios where agents seek to fulfil their individual goals. We prove theoretically that our learning method enables independent agents to learn Nash equilibrium strategies in *polynomial time*. We demonstrate that our framework tackles previously unsolvable tasks such as *Coordination Navigation* and *large selfish routing games*, and that it outperforms state-of-the-art MARL baselines such as MADDPG and COMIX in these scenarios.
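For readers unfamiliar with potential games, the standard defining property (a textbook definition, not quoted from this paper) is that a single potential function captures every agent's incentive to deviate. The notation below is generic rather than the paper's: $R_i$ is agent $i$'s reward, $a_i$ its action, $a_{-i}$ the joint action of the other agents, and $\phi$ the potential.

% Defining property of a potential game: any unilateral deviation changes
% an agent's reward by exactly the change in a single shared potential.
\[
R_i(a_i', a_{-i}) - R_i(a_i, a_{-i})
  \;=\; \phi(a_i', a_{-i}) - \phi(a_i, a_{-i}),
\qquad \forall i,\ \forall a_i', a_i,\ \forall a_{-i}.
\]

Intuitively, in the stochastic setting this kind of condition is imposed on the per-state rewards, so equilibria of the game align with optima of a single potential; this is what makes reducing equilibrium-finding to an optimisation problem, and hence polynomial-time guarantees, plausible.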

Cite

Text

Mguni et al. "Learning in Nonzero-Sum Stochastic Games with Potentials." International Conference on Machine Learning, 2021.

Markdown

[Mguni et al. "Learning in Nonzero-Sum Stochastic Games with Potentials." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/mguni2021icml-learning/)

BibTeX

@inproceedings{mguni2021icml-learning,
  title     = {{Learning in Nonzero-Sum Stochastic Games with Potentials}},
  author    = {Mguni, David H and Wu, Yutong and Du, Yali and Yang, Yaodong and Wang, Ziyi and Li, Minne and Wen, Ying and Jennings, Joel and Wang, Jun},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {7688--7699},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/mguni2021icml-learning/}
}