Markov Games as a Framework for Multi-Agent Reinforcement Learning
Abstract
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed in their behavior. The framework of Markov games allows us to widen this view to include multiple adaptive agents with interacting or competing goals. This paper considers a step in this direction in which exactly two agents with diametrically opposed goals share an environment. It describes a Q-learning-like algorithm for finding optimal policies and demonstrates its application to a simple two-player game in which the optimal policy is probabilistic.
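The Q-learning-like algorithm the abstract refers to is minimax-Q: the Q-table is indexed by the agent's action and the opponent's action, and the state value is the minimax value of the resulting matrix game. A minimal sketch, assuming a single-state matching-pennies game; the paper solves each stage game by linear programming, whereas here, for a self-contained illustration, the mixed strategy is found by a coarse grid search:

```python
import random

ACTIONS = [0, 1]  # heads, tails

def minimax_value(q):
    """Return (value, policy) maximizing, over mixed strategies pi,
    the worst case over opponent actions o of sum_a pi[a] * q[a][o].
    Grid search stands in for the paper's linear program."""
    best, best_pi = float("-inf"), None
    for i in range(101):
        pi = (i / 100, 1 - i / 100)
        worst = min(sum(pi[a] * q[a][o] for a in ACTIONS) for o in ACTIONS)
        if worst > best:
            best, best_pi = worst, pi
    return best, best_pi

def reward(a, o):
    # Matching pennies: agent wins if coins match (zero-sum).
    return 1.0 if a == o else -1.0

random.seed(0)
q = [[0.0, 0.0], [0.0, 0.0]]  # one state, so Q is a 2x2 payoff estimate
alpha, gamma = 1.0, 0.9
for _ in range(5000):
    a, o = random.choice(ACTIONS), random.choice(ACTIONS)
    v, _ = minimax_value(q)  # next-state value (same single state here)
    q[a][o] = (1 - alpha) * q[a][o] + alpha * (reward(a, o) + gamma * v)
    alpha *= 0.9995  # decaying learning rate

v, pi = minimax_value(q)
print(v, pi)  # value near 0, policy near (0.5, 0.5)
```

The learned policy is probabilistic, as the abstract notes: any deterministic choice in matching pennies can be exploited, so the minimax-optimal policy mixes both actions equally.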
Cite
Text
Littman. "Markov Games as a Framework for Multi-Agent Reinforcement Learning." International Conference on Machine Learning, 1994. doi:10.1016/B978-1-55860-335-6.50027-1
Markdown
[Littman. "Markov Games as a Framework for Multi-Agent Reinforcement Learning." International Conference on Machine Learning, 1994.](https://mlanthology.org/icml/1994/littman1994icml-markov/) doi:10.1016/B978-1-55860-335-6.50027-1
BibTeX
@inproceedings{littman1994icml-markov,
title = {{Markov Games as a Framework for Multi-Agent Reinforcement Learning}},
author = {Littman, Michael L.},
booktitle = {International Conference on Machine Learning},
year = {1994},
pages = {157-163},
doi = {10.1016/B978-1-55860-335-6.50027-1},
url = {https://mlanthology.org/icml/1994/littman1994icml-markov/}
}