Perpetual Learning for Non-Cooperative Multiple Agents
Abstract
This paper examines, by argument, the dynamics of the sequences of behavioural choices made when non-cooperative, restricted-memory agents learn in partially observable stochastic games. Each sequence of combined agent strategies (joint-policies) can be thought of as a walk through the space of all possible joint-policies. We argue that this walk, while containing random elements, is also driven by each agent's drive to improve its current situation at each point, and we posit a learning-pressure field across policy space to represent this drive. Different learning choices may skew this learning pressure and so affect the simultaneous joint learning of multiple agents.
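The walk described above can be pictured with a toy simulation. The sketch below is purely illustrative and not the paper's method: two self-interested agents in a hypothetical 2x2 matrix game each nudge a mixed policy toward higher own expected payoff (the deterministic "pressure" component) while also taking random steps, tracing a path through joint-policy space. The payoff matrices, step sizes, and noise level are all assumptions chosen for the example.

```python
# Illustrative sketch only (not the paper's algorithm): a joint-policy
# "walk" for two non-cooperative agents in a 2x2 matrix game.
import random

# Hypothetical prisoner's-dilemma-style payoffs (row = agent 1's action,
# column = agent 2's action), chosen purely for illustration.
A = [[3, 0], [5, 1]]  # agent 1's payoffs
B = [[3, 5], [0, 1]]  # agent 2's payoffs

def expected(payoff, p, q):
    """Expected payoff when the agents play action 0 with probabilities p, q."""
    return (p * q * payoff[0][0] + p * (1 - q) * payoff[0][1]
            + (1 - p) * q * payoff[1][0] + (1 - p) * (1 - q) * payoff[1][1])

def walk(steps=200, lr=0.05, noise=0.02, seed=0):
    """Simulate a walk through joint-policy space [0,1] x [0,1]."""
    rng = random.Random(seed)
    p, q = 0.5, 0.5          # initial joint policy
    trajectory = [(p, q)]
    eps = 1e-4
    for _ in range(steps):
        # Finite-difference estimate of each agent's own payoff gradient:
        # the deterministic "learning pressure" at this point in policy space.
        gp = (expected(A, p + eps, q) - expected(A, p - eps, q)) / (2 * eps)
        gq = (expected(B, p, q + eps) - expected(B, p, q - eps)) / (2 * eps)
        # Simultaneous updates: pressure plus a random element, clamped to [0,1].
        p = min(1.0, max(0.0, p + lr * gp + rng.gauss(0, noise)))
        q = min(1.0, max(0.0, q + lr * gq + rng.gauss(0, noise)))
        trajectory.append((p, q))
    return trajectory

traj = walk()
```

In this particular game the pressure pushes both agents toward the dominant action, so the walk drifts to one corner of policy space despite the noise; other payoff choices would skew the pressure field and the resulting walk differently.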
Dickens, Luke. "Perpetual Learning for Non-Cooperative Multiple Agents." AAAI Conference on Artificial Intelligence, 2008, pp. 1792-1793.