Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games

Abstract

In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games. While FTRL is guaranteed to converge to a Nash equilibrium when the strategies are time-averaged, many variants suffer from limit cycling behavior, i.e., they lack a last-iterate convergence guarantee. To this end, we propose mutant FTRL (M-FTRL), an algorithm that introduces mutation as a perturbation of the action probabilities. We then investigate the continuous-time dynamics of M-FTRL and provide strong convergence guarantees toward stationary points that approximate Nash equilibria under full-information feedback. Furthermore, our simulations demonstrate that M-FTRL can enjoy faster convergence rates than FTRL and optimistic FTRL under full-information feedback, and, surprisingly, exhibits clear convergence under bandit feedback.
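To make the idea concrete, below is a minimal sketch of mutation-perturbed FTRL on matching pennies. It assumes entropy regularization (so the FTRL iterate is a softmax over cumulative payoffs) and a replicator-mutator-style perturbation `mu * (ref / x - 1)` toward the uniform reference strategy; the exact perturbation form, step sizes, and guarantees in the paper may differ, so treat this purely as an illustration of the mechanism.

```python
import numpy as np

# Matching pennies: row player's payoff matrix in a zero-sum game.
# The unique Nash equilibrium is uniform play (0.5, 0.5) for both players.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def m_ftrl(eta=0.05, mu=0.1, steps=5000):
    """Entropy-regularized FTRL whose payoff gradients are perturbed by a
    mutation term pulling each player toward a reference strategy.
    (Illustrative form only; `mu` is the mutation strength.)"""
    n = A.shape[0]
    ref = np.ones(n) / n            # mutation target: uniform strategy
    sx = np.array([1.0, 0.0])       # cumulative perturbed payoffs, row player
    sy = np.zeros(n)                # cumulative perturbed payoffs, column player
    x, y = softmax(sx), softmax(sy)
    for _ in range(steps):
        # Base payoff gradient plus mutation perturbation mu * (ref/x - 1),
        # which vanishes at x = ref and damps cycling around the equilibrium.
        gx = A @ y + mu * (ref / x - 1.0)
        gy = -A.T @ x + mu * (ref / y - 1.0)
        sx += eta * gx
        sy += eta * gy
        x, y = softmax(sx), softmax(sy)
    return x, y  # last iterates, not time averages

x, y = m_ftrl()
```

With `mu = 0` this reduces to plain multiplicative-weights FTRL, whose last iterate cycles around the equilibrium of matching pennies; the mutation term adds damping, so the last iterate itself settles near the (perturbed) stationary point close to uniform play.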

Cite

Text

Abe et al. "Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games." Uncertainty in Artificial Intelligence, 2022.

Markdown

[Abe et al. "Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games." Uncertainty in Artificial Intelligence, 2022.](https://mlanthology.org/uai/2022/abe2022uai-mutationdriven/)

BibTeX

@inproceedings{abe2022uai-mutationdriven,
  title     = {{Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games}},
  author    = {Abe, Kenshi and Sakamoto, Mitsuki and Iwasaki, Atsushi},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2022},
  pages     = {1--10},
  volume    = {180},
  url       = {https://mlanthology.org/uai/2022/abe2022uai-mutationdriven/}
}