Evolutionary Dynamics of Q-Learning over the Sequence Form

Abstract

Multi-agent learning is a challenging open task in artificial intelligence. An interesting connection is known between multi-agent learning algorithms and evolutionary game theory: the learning dynamics of some algorithms can be modeled as replicator dynamics with a mutation term. Inspired by the recent sequence-form replicator dynamics, we develop a new version of the Q-learning algorithm that works on the sequence form of an extensive-form game, thus allowing an exponential reduction in the size of the dynamics with respect to the normal form. The dynamics of the proposed algorithm can be modeled using the sequence-form replicator dynamics with a mutation term. We show that, although sequence-form and normal-form replicator dynamics are realization equivalent, the Q-learning algorithm applied to the two forms has non-realization-equivalent dynamics. Unlike previous work on evolutionary game theory models for multi-agent learning, we provide an experimental evaluation to show the accuracy of the model.
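As a rough illustration of what "replicator dynamics with a mutation term" means, the sketch below integrates the well-known normal-form model of Boltzmann Q-learning in a two-player matrix game. It is not the sequence-form algorithm proposed in the paper. The function names, the example game (a Prisoner's Dilemma), and the values of `alpha`, `temperature`, `dt`, and `steps` are all assumptions chosen for illustration. For a strategy x with expected payoffs u, the model is dx_i/dt = x_i * [(alpha/T) * (u_i - x·u) + alpha * sum_j x_j * ln(x_j / x_i)], where the first term is the replicator (selection) part and the second acts as mutation.

```python
import math

def q_learning_dynamics(x, payoffs, alpha, temperature):
    """Time derivative of mixed strategy x under the normal-form
    Boltzmann Q-learning model (selection term plus mutation term).
    payoffs[i] is the expected payoff of action i against the opponent."""
    avg = sum(xi * ui for xi, ui in zip(x, payoffs))
    dx = []
    for xi, ui in zip(x, payoffs):
        # Replicator (selection) part: reward actions that beat the average.
        selection = (alpha / temperature) * (ui - avg)
        # Mutation part: entropy-like term that pushes toward exploration.
        mutation = alpha * sum(xj * math.log(xj / xi) for xj in x)
        dx.append(xi * (selection + mutation))
    return dx

def simulate(A, B, x, y, alpha=0.1, temperature=0.5, dt=0.01, steps=5000):
    """Euler integration for a bimatrix game.
    A[i][j] and B[i][j] are the row and column player's payoffs
    when the row player picks i and the column player picks j."""
    for _ in range(steps):
        ux = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
        uy = [sum(B[i][j] * x[i] for i in range(len(x))) for j in range(len(y))]
        dx = q_learning_dynamics(x, ux, alpha, temperature)
        dy = q_learning_dynamics(y, uy, alpha, temperature)
        x = [xi + dt * d for xi, d in zip(x, dx)]
        y = [yi + dt * d for yi, d in zip(y, dy)]
    return x, y

# Hypothetical example game: Prisoner's Dilemma, actions (cooperate, defect).
A = [[3.0, 0.0], [5.0, 1.0]]
B = [[3.0, 5.0], [0.0, 1.0]]
x_final, y_final = simulate(A, B, [0.5, 0.5], [0.5, 0.5])
print(x_final, y_final)
```

Because of the mutation term, the strategies settle in the interior of the simplex rather than on a pure strategy. A lower `temperature` moves the rest point closer to the pure equilibrium.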

Cite

Text

Panozzo et al. "Evolutionary Dynamics of Q-Learning over the Sequence Form." AAAI Conference on Artificial Intelligence, 2014. doi:10.1609/AAAI.V28I1.9012

Markdown

[Panozzo et al. "Evolutionary Dynamics of Q-Learning over the Sequence Form." AAAI Conference on Artificial Intelligence, 2014.](https://mlanthology.org/aaai/2014/panozzo2014aaai-evolutionary/) doi:10.1609/AAAI.V28I1.9012

BibTeX

@inproceedings{panozzo2014aaai-evolutionary,
  title     = {{Evolutionary Dynamics of Q-Learning over the Sequence Form}},
  author    = {Panozzo, Fabio and Gatti, Nicola and Restelli, Marcello},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2014},
  pages     = {2034-2040},
  doi       = {10.1609/AAAI.V28I1.9012},
  url       = {https://mlanthology.org/aaai/2014/panozzo2014aaai-evolutionary/}
}