Bounding Regret in Empirical Games
Abstract
Empirical game-theoretic analysis refers to a set of models and techniques for solving large-scale games. However, these methods lack a quantitative guarantee on the quality of the approximate Nash equilibria (NE) they output. A natural quantitative guarantee for such an approximate NE is its regret in the game (i.e., the best deviation gain). We formulate this deviation-gain computation as a multi-armed bandit problem, with a new optimization goal unlike those studied in prior work. We propose an efficient algorithm, Super-Arm UCB (SAUCB), for this problem, along with a number of variants. We present sample complexity results as well as extensive experiments showing that SAUCB outperforms several baselines.
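To illustrate the quantity the paper bounds, here is a minimal sketch of computing the regret (best deviation gain) of a mixed profile in a two-player normal-form game with fully known payoff matrices. In the empirical-game setting the paper studies, these payoffs would instead be estimated from noisy simulation samples, which is what motivates the bandit formulation; this sketch is not the paper's SAUCB algorithm, and the names (`profile_regret`, `A`, `B`) are illustrative assumptions.

```python
import numpy as np

def profile_regret(payoffs, profile):
    """Regret (best deviation gain) of a mixed profile in a two-player
    normal-form game. `payoffs` is a pair of payoff matrices (rows index
    player 1's pure strategies, columns player 2's); `profile` is a pair
    of mixed strategies. Illustrative sketch, not the paper's method."""
    A, B = payoffs   # payoff matrices for players 1 and 2
    x, y = profile   # mixed strategies (probability vectors)
    # Expected payoff of each pure strategy against the opponent's mix.
    p1_pure = A @ y  # player 1's payoff for each row vs. y
    p2_pure = x @ B  # player 2's payoff for each column vs. x
    # Deviation gain: best pure-strategy payoff minus current payoff.
    gain1 = p1_pure.max() - x @ A @ y
    gain2 = p2_pure.max() - x @ B @ y
    return max(gain1, gain2)

# Example: matching pennies; the uniform profile is an exact NE, so
# its regret is 0. A profile is an exact NE iff its regret is 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
uniform = np.array([0.5, 0.5])
print(profile_regret((A, -A), (uniform, uniform)))  # 0.0
```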
Cite
Text
Jecmen et al. "Bounding Regret in Empirical Games." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I04.5851
Markdown
[Jecmen et al. "Bounding Regret in Empirical Games." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/jecmen2020aaai-bounding/) doi:10.1609/AAAI.V34I04.5851
BibTeX
@inproceedings{jecmen2020aaai-bounding,
title = {{Bounding Regret in Empirical Games}},
author = {Jecmen, Steven and Sinha, Arunesh and Li, Zun and Tran-Thanh, Long},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2020},
pages = {4280-4287},
doi = {10.1609/AAAI.V34I04.5851},
url = {https://mlanthology.org/aaai/2020/jecmen2020aaai-bounding/}
}