A Polynomial-Time Nash Equilibrium Algorithm for Repeated Stochastic Games
Abstract
We present a polynomial-time algorithm that always finds an (approximate) Nash equilibrium for repeated two-player stochastic games. The algorithm exploits the folk theorem to derive a strategy profile that forms an equilibrium by buttressing mutually beneficial behavior with threats, where possible. One component of our algorithm efficiently searches for an approximation of the egalitarian point, the fairest Pareto-efficient solution. The paper concludes by applying the algorithm to a set of grid games to illustrate typical solutions the algorithm finds. These solutions compare very favorably to those found by competing algorithms, resulting in strategies with higher social welfare, as well as guaranteed computational efficiency.
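To make the notion of an egalitarian point concrete, the following minimal sketch computes it for a toy setting that is much simpler than the one the paper addresses: a finite list of two-player payoff pairs whose convex hull is taken as the feasible set. It is not the paper's search procedure for stochastic games. The function name `egalitarian_point`, the `threat` parameter (standing in for the payoffs each player can be held to by threats), and the example payoffs are all assumptions made for illustration. The point returned maximizes the smaller of the two players' advantages over their threat values.

```python
from itertools import combinations


def egalitarian_point(payoffs, threat=(0.0, 0.0)):
    """Return the point in the convex hull of `payoffs` that maximizes
    the minimum advantage over `threat` (illustrative sketch only)."""

    def advantage(p):
        # How much each player gains relative to the assumed threat values.
        return (p[0] - threat[0], p[1] - threat[1])

    def score(p):
        # Egalitarian objective: the worse-off player's advantage.
        return min(advantage(p))

    # The optimum lies at an input point or where a segment between two
    # input points crosses the line of equal advantages.
    candidates = list(payoffs)
    for p, q in combinations(payoffs, 2):
        dp = advantage(p)[0] - advantage(p)[1]
        dq = advantage(q)[0] - advantage(q)[1]
        if dp * dq < 0:  # endpoints lie on opposite sides of that line
            t = dp / (dp - dq)  # interpolation weight where advantages match
            candidates.append((p[0] + t * (q[0] - p[0]),
                               p[1] + t * (q[1] - p[1])))
    return max(candidates, key=score)


# Hypothetical payoffs: two lopsided outcomes and one mediocre fair outcome.
point = egalitarian_point([(4.0, 0.0), (0.0, 4.0), (1.0, 1.0)])
print(point)
```

In this toy example, alternating evenly between the two lopsided outcomes gives each player more than the fair-but-mediocre outcome, so the mixture is selected. Checking every pair of points rather than only hull edges keeps the sketch short at the cost of quadratic work in the number of payoff pairs.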
Cite
Text
de Cote and Littman. "A Polynomial-Time Nash Equilibrium Algorithm for Repeated Stochastic Games." Conference on Uncertainty in Artificial Intelligence, 2008.
Markdown
[de Cote and Littman. "A Polynomial-Time Nash Equilibrium Algorithm for Repeated Stochastic Games." Conference on Uncertainty in Artificial Intelligence, 2008.](https://mlanthology.org/uai/2008/decote2008uai-polynomial/)
BibTeX
@inproceedings{decote2008uai-polynomial,
title = {{A Polynomial-Time Nash Equilibrium Algorithm for Repeated Stochastic Games}},
author = {de Cote, Enrique Munoz and Littman, Michael L.},
booktitle = {Conference on Uncertainty in Artificial Intelligence},
year = {2008},
pages = {419--426},
url = {https://mlanthology.org/uai/2008/decote2008uai-polynomial/}
}