Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games

Abstract

Mean-field games have been used as a theoretical tool to obtain an approximate Nash equilibrium for symmetric and anonymous $N$-player games. However, limiting their applicability, existing theoretical results assume variations of a “population generative model”, which allows the learning algorithm to modify the population distribution arbitrarily. Moreover, learning algorithms typically work with abstract population simulators rather than the $N$-player game itself. Instead, we show that $N$ agents running policy mirror ascent converge to the Nash equilibrium of the regularized game within $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ samples from a single sample trajectory, without a population generative model, up to a standard $\mathcal{O}(\frac{1}{\sqrt{N}})$ error due to the mean-field approximation. Departing from the literature, instead of working with the best-response map we first show that a policy mirror ascent map can be used to construct a contractive operator having the Nash equilibrium as its fixed point. We analyze single-path TD learning for $N$-agent games, proving sample complexity guarantees using only a sample path from the $N$-agent simulator, without a population generative model. Furthermore, we demonstrate that our methodology allows for independent learning by $N$ agents with finite-sample guarantees.
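To illustrate the core primitive named in the abstract, below is a minimal, hypothetical sketch of one policy mirror ascent step with a KL-divergence proximal term, which reduces to a multiplicative-weights update on the policy. It is not the paper's algorithm (which operates on the regularized $N$-agent mean-field game); the function name, step size `eta`, and the toy Q-values are illustrative assumptions.

```python
import numpy as np

def mirror_ascent_step(pi, q, eta=0.5):
    """One mirror ascent step on a single state's policy.

    With a KL proximal term, the update is the classic
    multiplicative-weights rule: pi'(a) ∝ pi(a) * exp(eta * q(a)).

    pi : probability vector over actions (current policy)
    q  : estimated action values for those actions
    """
    logits = np.log(pi) + eta * q
    logits -= logits.max()          # subtract max for numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum()    # renormalize to a distribution

# Toy usage: repeated steps shift probability mass toward the
# higher-value action while staying on the probability simplex.
pi = np.array([0.5, 0.5])
q = np.array([1.0, 0.0])            # illustrative Q-value estimates
for _ in range(20):
    pi = mirror_ascent_step(pi, q)
```

In the paper's setting, each of the $N$ agents would apply such an update using Q-value estimates obtained by single-path TD learning, rather than the fixed toy values used here.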

Cite

Text

Yardim et al. "Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games." International Conference on Machine Learning, 2023.

Markdown

[Yardim et al. "Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/yardim2023icml-policy/)

BibTeX

@inproceedings{yardim2023icml-policy,
  title     = {{Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games}},
  author    = {Yardim, Batuhan and Cayci, Semih and Geist, Matthieu and He, Niao},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {39722--39754},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/yardim2023icml-policy/}
}