Convergence of No-Swap-Regret Dynamics in Self-Play
Abstract
In this paper, we investigate the question of whether no-swap-regret dynamics have stronger convergence properties in repeated games than regular no-external-regret dynamics. We prove that in almost all symmetric zero-sum games under symmetric initializations of the agents, no-swap-regret dynamics in self-play are guaranteed to converge in a strong ``frequent-iterate'' sense to the Nash equilibrium: in all but a vanishing fraction of the rounds, the players must play a strategy profile close to a symmetric Nash equilibrium. Remarkably, relaxing any of these three constraints, i.e. by allowing either i) asymmetric initial conditions, or ii) an asymmetric game or iii) no-external regret dynamics suffices to destroy this result and lead to complex non-equilibrating or even chaotic behavior. In a dual type of result, we show that the power of no-swap-regret dynamics comes at a cost of imposing a time-asymmetry on its inputs. While no-external-regret dynamics can be completely determined by the cumulative reward vector received by each player, we show there does not exist any general no-swap-regret dynamics defined on the same state space. In fact, we prove that any no-swap-regret learning algorithm must play a time-asymmetric function over the set of previously observed rewards, ruling out any dynamics based on a symmetric function of the current set of rewards.
Cite
Text
Leme et al. "Convergence of No-Swap-Regret Dynamics in Self-Play." Neural Information Processing Systems, 2024. doi:10.52202/079017-2426Markdown
[Leme et al. "Convergence of No-Swap-Regret Dynamics in Self-Play." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/leme2024neurips-convergence/) doi:10.52202/079017-2426BibTeX
@inproceedings{leme2024neurips-convergence,
title = {{Convergence of No-Swap-Regret Dynamics in Self-Play}},
author = {Leme, Renato Paes and Piliouras, Georgios and Schneider, Jon},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-2426},
url = {https://mlanthology.org/neurips/2024/leme2024neurips-convergence/}
}