Near-Optimal $φ$-Regret Learning in Extensive-Form Games
Abstract
In this paper, we establish efficient and uncoupled learning dynamics such that, when employed by all players in multiplayer perfect-recall imperfect-information extensive-form games, the trigger regret of each player grows as $O(\log T)$ after $T$ repetitions of play. This improves exponentially over the prior best known trigger-regret bound of $O(T^{1/4})$, and settles a recent open question of Bai et al. (2022). As an immediate consequence, we guarantee convergence to the set of extensive-form correlated equilibria and coarse correlated equilibria at a near-optimal rate of $\frac{\log T}{T}$. Building on prior work, at the heart of our construction lies a more general result regarding fixed points of rational functions of polynomial degree, a property that we establish for the fixed points of (coarse) trigger deviation functions. Moreover, our construction leverages a refined regret circuit for the convex hull, which, unlike prior guarantees, preserves the RVU property introduced by Syrgkanis et al. (NIPS, 2015); this observation is of independent interest for establishing near-optimal regret under learning dynamics based on a CFR-type decomposition of the regret.
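For context, the $\phi$-regret notion underlying the abstract can be sketched as follows; the notation below is a standard illustrative formulation (not necessarily the paper's exact formalism), where $x^{(t)}$ is the player's strategy at round $t$, $u^{(t)}$ the utility vector observed at that round, and $\Phi$ a set of strategy transformations (e.g., trigger deviation functions):

```latex
% Phi-regret after T rounds: the gap to the best fixed
% strategy transformation phi in the comparator set Phi.
\mathrm{Reg}^{\Phi}_T := \max_{\phi \in \Phi} \,
  \sum_{t=1}^{T} \left\langle u^{(t)},\,
    \phi\!\big(x^{(t)}\big) - x^{(t)} \right\rangle
```

Taking $\Phi$ to be the set of (coarse) trigger deviation functions recovers trigger regret; external regret corresponds to the special case where $\Phi$ contains only constant transformations.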
Cite

Text
Anagnostides et al. "Near-Optimal $φ$-Regret Learning in Extensive-Form Games." International Conference on Machine Learning, 2023.

Markdown
[Anagnostides et al. "Near-Optimal $φ$-Regret Learning in Extensive-Form Games." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/anagnostides2023icml-nearoptimal/)

BibTeX
@inproceedings{anagnostides2023icml-nearoptimal,
  title     = {{Near-Optimal $φ$-Regret Learning in Extensive-Form Games}},
  author    = {Anagnostides, Ioannis and Farina, Gabriele and Sandholm, Tuomas},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {814--839},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/anagnostides2023icml-nearoptimal/}
}