Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Abstract

Many real-world problems, such as network packet routing and urban traffic control, are naturally modelled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly with the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent’s value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.

Cite

Text

Foerster et al. "Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning." International Conference on Machine Learning, 2017.

Markdown

[Foerster et al. "Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/foerster2017icml-stabilising/)

BibTeX

@inproceedings{foerster2017icml-stabilising,
  title     = {{Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning}},
  author    = {Foerster, Jakob and Nardelli, Nantas and Farquhar, Gregory and Afouras, Triantafyllos and Torr, Philip H. S. and Kohli, Pushmeet and Whiteson, Shimon},
  booktitle = {International Conference on Machine Learning},
  year      = {2017},
  pages     = {1146--1155},
  volume    = {70},
  url       = {https://mlanthology.org/icml/2017/foerster2017icml-stabilising/}
}