Perpetual Learning for Non-Cooperative Multiple Agents

Abstract

This paper examines, by argument, the dynamics of the sequences of behavioural choices made when non-cooperative, restricted-memory agents learn in partially observable stochastic games. These sequences of combined agent strategies (joint-policies) can be thought of as a walk through the space of all possible joint-policies. We argue that this walk, while containing random elements, is also driven by each agent's drive to improve its current situation at each point, and we posit a learning pressure field across policy space to represent this drive. Different learning choices may skew this learning pressure and affect the simultaneous joint learning of multiple agents.

Cite

Text

Dickens. "Perpetual Learning for Non-Cooperative Multiple Agents." AAAI Conference on Artificial Intelligence, 2008.

Markdown

[Dickens. "Perpetual Learning for Non-Cooperative Multiple Agents." AAAI Conference on Artificial Intelligence, 2008.](https://mlanthology.org/aaai/2008/dickens2008aaai-perpetual/)

BibTeX

@inproceedings{dickens2008aaai-perpetual,
  title     = {{Perpetual Learning for Non-Cooperative Multiple Agents}},
  author    = {Dickens, Luke},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2008},
  pages     = {1792--1793},
  url       = {https://mlanthology.org/aaai/2008/dickens2008aaai-perpetual/}
}