A Near-Optimal Poly-Time Algorithm for Learning a Class of Stochastic Games
Abstract
We present a new algorithm for polynomial-time learning of near-optimal behavior in stochastic games. This algorithm incorporates and integrates important recent results of Kearns and Singh [1998] in reinforcement learning and of Monderer and Tennenholtz [1997] in repeated games. In stochastic games we face an exploration-vs.-exploitation dilemma more complex than in Markov decision processes: given information about particular parts of a game matrix, how much effort should the agent invest in learning its unknown parts? We explain and address these issues within the class of single-controller stochastic games; the solution can be extended to stochastic games in general.
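To make the setting concrete, here is a minimal, hypothetical sketch of a single-controller stochastic game: state transitions depend only on the controller's action, while the stage payoff depends on both players' actions. All state names, actions, and values below are illustrative assumptions, not taken from the paper.

```python
import random

# transitions[state][controller_action] -> distribution over next states.
# Only the controller's action influences the transition (single-controller).
transitions = {
    "s0": {"a": {"s0": 0.2, "s1": 0.8}, "b": {"s0": 1.0}},
    "s1": {"a": {"s0": 0.5, "s1": 0.5}, "b": {"s1": 1.0}},
}

# payoff[state][(controller_action, opponent_action)] -> reward to the controller.
# The stage payoff depends on BOTH players' actions.
payoff = {
    "s0": {("a", "x"): 1.0, ("a", "y"): 0.0, ("b", "x"): 0.5, ("b", "y"): 0.5},
    "s1": {("a", "x"): 0.0, ("a", "y"): 1.0, ("b", "x"): 0.2, ("b", "y"): 0.8},
}

def step(state, ctrl_action, opp_action, rng=random.random):
    """One stage of play: returns (reward, next_state)."""
    reward = payoff[state][(ctrl_action, opp_action)]
    dist = transitions[state][ctrl_action]
    u, acc = rng(), 0.0
    for nxt, p in dist.items():
        acc += p
        if u < acc:
            return reward, nxt
    return reward, nxt  # guard against floating-point slack

reward, next_state = step("s0", "b", "y")
```

The learning problem the abstract describes lives in this structure: the agent must trade off exploiting the parts of `payoff` and `transitions` it has already estimated against spending play to learn the unknown entries.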
Cite
Text
Brafman and Tennenholtz. "A Near-Optimal Poly-Time Algorithm for Learning a Class of Stochastic Games." International Joint Conference on Artificial Intelligence, 1999.
Markdown
[Brafman and Tennenholtz. "A Near-Optimal Poly-Time Algorithm for Learning a Class of Stochastic Games." International Joint Conference on Artificial Intelligence, 1999.](https://mlanthology.org/ijcai/1999/brafman1999ijcai-near/)
BibTeX
@inproceedings{brafman1999ijcai-near,
title = {{A Near-Optimal Poly-Time Algorithm for Learning a Class of Stochastic Games}},
author = {Brafman, Ronen I. and Tennenholtz, Moshe},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1999},
pages = {734-739},
url = {https://mlanthology.org/ijcai/1999/brafman1999ijcai-near/}
}