Bounding Regret in Empirical Games

Abstract

Empirical game-theoretic analysis refers to a set of models and techniques for solving large-scale games. However, these methods typically come with no quantitative guarantee on the quality of the output approximate Nash equilibria (NE). A natural quality measure for such an approximate NE is its regret in the game (i.e., the maximum gain from unilaterally deviating). We formulate this deviation-gain computation as a multi-armed bandit problem, with a new optimization goal unlike those studied in prior work. We propose an efficient algorithm, Super-Arm UCB (SAUCB), for this problem, along with a number of variants. We present sample complexity results as well as extensive experiments showing that SAUCB outperforms several baselines.
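To make the bandit framing concrete: each pure-strategy deviation can be treated as an arm, and pulling an arm draws one noisy simulated payoff. A UCB-style index then concentrates samples on the most promising deviations, and the best empirical mean minus the candidate profile's payoff gives a regret estimate. The sketch below is illustrative only, using standard UCB1 with invented names and a toy payoff simulator; it is not the paper's SAUCB algorithm.

```python
import math
import random

def ucb_best_deviation(sample_payoff, n_arms, budget, seed=0):
    """Estimate the best deviation payoff with a UCB1-style bandit.
    Each arm is a candidate pure-strategy deviation; pulling an arm
    draws one noisy payoff sample. Illustrative sketch, not SAUCB."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    # Pull each arm once to initialize the empirical means.
    for a in range(n_arms):
        counts[a] = 1
        means[a] = sample_payoff(a, rng)
    for t in range(n_arms, budget):
        # UCB1 index: empirical mean plus exploration bonus.
        a = max(range(n_arms),
                key=lambda i: means[i] + math.sqrt(2 * math.log(t + 1) / counts[i]))
        x = sample_payoff(a, rng)
        counts[a] += 1
        means[a] += (x - means[a]) / counts[a]  # incremental mean update
    best = max(range(n_arms), key=lambda i: means[i])
    return best, means[best]

# Toy simulator: three deviations with true payoffs 0.2, 0.5, 0.8
# plus bounded uniform noise (all values are hypothetical).
true_means = [0.2, 0.5, 0.8]
def noisy_payoff(arm, rng):
    return true_means[arm] + rng.uniform(-0.1, 0.1)

best_arm, est = ucb_best_deviation(noisy_payoff, 3, 600)
profile_payoff = 0.6  # hypothetical payoff of the candidate profile
regret_estimate = max(0.0, est - profile_payoff)
```

With bounded noise and a meaningful gap between arms, the estimated best deviation converges quickly; the remaining question the paper addresses is how to allocate samples optimally for this specific objective rather than for cumulative reward.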

Cite

Text

Jecmen et al. "Bounding Regret in Empirical Games." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/aaai.v34i04.5851

Markdown

[Jecmen et al. "Bounding Regret in Empirical Games." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/jecmen2020aaai-bounding/) doi:10.1609/aaai.v34i04.5851

BibTeX

@inproceedings{jecmen2020aaai-bounding,
  title     = {{Bounding Regret in Empirical Games}},
  author    = {Jecmen, Steven and Sinha, Arunesh and Li, Zun and Tran-Thanh, Long},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {4280--4287},
  doi       = {10.1609/aaai.v34i04.5851},
  url       = {https://mlanthology.org/aaai/2020/jecmen2020aaai-bounding/}
}