Hedging in Games: Faster Convergence of External and Swap Regrets

Abstract

We consider the setting where players run the Hedge algorithm or its optimistic variant \cite{syrgkanis2015fast} to play an n-action game repeatedly for T rounds. 1) For two-player games, we show that the regret of optimistic Hedge decays at \tilde{O}( 1/T ^5/6 ), improving the previous bound O(1/T^3/4) by \cite{syrgkanis2015fast}. 2) In contrast, we show that the convergence rate of vanilla Hedge is no better than \tilde{\Omega}(1/ \sqrt{T})}, addressing an open question posted in \cite{syrgkanis2015fast}. For general m-player games, we show that the swap regret of each player decays at rate \tilde{O}(m^1/2 (n/T)^3/4) when they combine optimistic Hedge with the classical external-to-internal reduction of Blum and Mansour \cite{blum2007external}. The algorithm can also be modified to achieve the same rate against itself and a rate of \tilde{O}(\sqrt{n/T}) against adversaries. Via standard connections, our upper bounds also imply faster convergence to coarse correlated equilibria in two-player games and to correlated equilibria in multiplayer games.

Cite

Text

Chen and Peng. "Hedging in Games: Faster Convergence of External and Swap Regrets." Neural Information Processing Systems, 2020.

Markdown

[Chen and Peng. "Hedging in Games: Faster Convergence of External and Swap Regrets." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/chen2020neurips-hedging/)

BibTeX

@inproceedings{chen2020neurips-hedging,
  title     = {{Hedging in Games: Faster Convergence of External and Swap Regrets}},
  author    = {Chen, Xi and Peng, Binghui},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/chen2020neurips-hedging/}
}