MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

Abstract

In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse rewards. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from the experience replay buffer. Under the widely used assumption of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both the individual Q-values and the total Q-value. MASER then designs an individual intrinsic reward for each agent based on an actionable representation relevant to Q-learning, so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark.
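To make the mechanism concrete, below is a minimal Python sketch of the two ingredients the abstract describes: scoring replay-buffer states by a mix of individual and total Q-values to select per-agent subgoals, and shaping an intrinsic reward as distance in a learned representation space. All names (`select_subgoals`, `q_individual`, `q_total`, `embed`) and the weighted scoring rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def select_subgoals(buffer, q_individual, q_total, n_agents, alpha=0.5):
    """Pick one candidate subgoal observation per agent from a replay buffer.

    Hypothetical sketch; the real MASER selection rule may differ.
    buffer: list of episodes; each episode is a list of per-step dicts
            with key 'obs' -> array of shape (n_agents, obs_dim).
    q_individual(obs_i, agent_id) -> float: agent's individual Q estimate.
    q_total(obs) -> float: joint (mixed) Q estimate for the joint observation.
    alpha: weight trading off individual value against joint value.
    """
    subgoals = []
    for agent in range(n_agents):
        best_score, best_obs = -np.inf, None
        for episode in buffer:
            for step in episode:
                obs = step['obs']
                # Score a visited state by both its individual and joint value,
                # so the chosen subgoal is good for the agent AND the team.
                score = (alpha * q_individual(obs[agent], agent)
                         + (1.0 - alpha) * q_total(obs))
                if score > best_score:
                    best_score, best_obs = score, obs[agent]
        subgoals.append(best_obs)
    return subgoals

def intrinsic_reward(obs_i, goal_i, embed):
    """Intrinsic reward as negative distance in a learned (Q-relevant)
    representation space; `embed` is an assumed learned encoder."""
    return -float(np.linalg.norm(embed(obs_i) - embed(goal_i)))
```

In this sketch the intrinsic reward grows as an agent's embedded observation approaches its embedded subgoal, which is one simple way to realize "reach the subgoal while maximizing the joint action value" when extrinsic rewards are sparse.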

Cite

Text

Jeon et al. "MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer." International Conference on Machine Learning, 2022.

Markdown

[Jeon et al. "MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/jeon2022icml-maser/)

BibTeX

@inproceedings{jeon2022icml-maser,
  title     = {{MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer}},
  author    = {Jeon, Jeewon and Kim, Woojun and Jung, Whiyoung and Sung, Youngchul},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {10041--10052},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/jeon2022icml-maser/}
}