Solving Stochastic Games

Abstract

Solving multi-agent reinforcement learning problems has proven difficult because of the lack of tractable algorithms. We provide the first approximation algorithm which solves stochastic games to within $\epsilon$ relative error of the optimal game-theoretic solution, in time polynomial in $1/\epsilon$. Our algorithm extends Murray and Gordon's (2007) modified Bellman equation, which determines the \emph{set} of all achievable utilities; this provides a truly general framework for multi-agent learning. Further, we empirically validate our algorithm and find the computational cost to be orders of magnitude less than what the theory predicts.
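The paper's algorithm operates on sets of achievable utility vectors in general-sum games, which is more involved than can be shown briefly. As a simpler point of reference, the sketch below implements the classical zero-sum special case: Shapley-style value iteration, where the scalar Bellman backup at each state is replaced by solving a matrix game via linear programming. All names and the stopping rule are illustrative assumptions, not the authors' method.

```python
import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(M):
    """Value of a zero-sum matrix game for the row (maximizing) player.

    LP: maximize v subject to x^T M[:, j] >= v for every column j,
    with x a probability distribution over rows.
    """
    m, n = M.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                       # minimize -v  ==  maximize v
    A_ub = np.hstack([-M.T, np.ones((n, 1))])   # v - x^T M[:, j] <= 0
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                  # sum(x) = 1, v unconstrained
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return -res.fun

def shapley_iteration(R, P, gamma=0.9, eps=1e-4):
    """Value iteration for a two-player zero-sum stochastic game.

    R[s] is the payoff matrix at state s; P[s][a1][a2] is the
    next-state distribution under joint action (a1, a2).
    """
    S = len(R)
    V = np.zeros(S)
    while True:
        V_new = np.empty(S)
        for s in range(S):
            m, n = R[s].shape
            # Backed-up matrix game: immediate payoff plus discounted value.
            Q = np.array([[R[s][i, j] + gamma * P[s][i][j] @ V
                           for j in range(n)] for i in range(m)])
            V_new[s] = solve_matrix_game(Q)
        if np.max(np.abs(V_new - V)) < eps * (1 - gamma) / (2 * gamma):
            return V_new
        V = V_new
```

In the general-sum setting addressed by the paper, this scalar backup is replaced by a backup over the set of all equilibrium utility vectors achievable from each state, which is what makes the targeted solution concept game-theoretically optimal rather than merely best-response.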

Cite

Text

Dermed and Isbell. "Solving Stochastic Games." Neural Information Processing Systems, 2009.

Markdown

[Dermed and Isbell. "Solving Stochastic Games." Neural Information Processing Systems, 2009.](https://mlanthology.org/neurips/2009/dermed2009neurips-solving/)

BibTeX

@inproceedings{dermed2009neurips-solving,
  title     = {{Solving Stochastic Games}},
  author    = {Dermed, Liam M. and Isbell, Charles L.},
  booktitle = {Neural Information Processing Systems},
  year      = {2009},
  pages     = {1186--1194},
  url       = {https://mlanthology.org/neurips/2009/dermed2009neurips-solving/}
}