Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination

Abstract

Many cooperative multiagent reinforcement learning environments provide agents with a sparse team-based reward as well as a dense agent-specific reward that incentivizes learning basic skills. Training policies solely on the team-based reward is often difficult due to its sparsity. Conversely, relying solely on the agent-specific reward is sub-optimal because it usually does not capture the team coordination objective. A common approach is to use reward shaping to construct a proxy reward by combining the individual rewards; however, this requires manual tuning for each environment. We introduce Multiagent Evolutionary Reinforcement Learning (MERL), a split-level training platform that handles the two objectives separately through two optimization processes. An evolutionary algorithm maximizes the sparse team-based objective through neuroevolution on a population of teams. Concurrently, a gradient-based optimizer trains policies to maximize only the dense agent-specific rewards. The gradient-based policies are periodically added to the evolutionary population, transferring information between the two optimization processes. This enables the evolutionary algorithm to use skills learned via the agent-specific rewards toward optimizing the global objective. Results demonstrate that MERL significantly outperforms state-of-the-art methods, such as MADDPG, on a number of difficult coordination benchmarks.
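The two-process structure described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: each "policy" here is a single scalar parameter, there is no environment or neural network, and every name (`TARGETS`, `dense_reward`, `team_reward`, `gradient_step`, `evolve`, `run_merl_sketch`) and every hyperparameter is a hypothetical choice made only for illustration. It shows an evolutionary loop ranking teams by a sparse team reward, a gradient learner following only the dense per-agent reward, and periodic migration of the gradient-trained team into the population.

```python
import random

# Toy assumptions (not from the paper): one scalar parameter per agent,
# and a fixed per-agent goal that defines the dense reward.
TARGETS = [0.5, -1.0, 2.0]
N_AGENTS = len(TARGETS)

def dense_reward(theta, i):
    """Agent-specific reward: higher when agent i is close to its own goal."""
    return -(theta - TARGETS[i]) ** 2

def team_reward(team, tol=0.2):
    """Sparse team reward: 1 only if every agent is within tol of its goal."""
    return 1.0 if all(abs(t - g) <= tol for t, g in zip(team, TARGETS)) else 0.0

def gradient_step(team, lr=0.1):
    """Gradient ascent on each agent's dense reward only.

    The derivative of -(theta - g)^2 with respect to theta is -2 * (theta - g).
    """
    return [t + lr * (-2.0 * (t - g)) for t, g in zip(team, TARGETS)]

def evolve(population, rng, sigma=0.1):
    """One generation: rank teams by sparse team reward, keep the top half,
    and refill the population with Gaussian-mutated copies of the elites."""
    ranked = sorted(population, key=team_reward, reverse=True)
    elites = ranked[: len(ranked) // 2]
    children = [
        [t + rng.gauss(0.0, sigma) for t in rng.choice(elites)]
        for _ in range(len(ranked) - len(elites))
    ]
    return elites + children

def run_merl_sketch(generations=60, pop_size=10, migrate_every=5, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(-3.0, 3.0) for _ in range(N_AGENTS)] for _ in range(pop_size)
    ]
    learner = [rng.uniform(-3.0, 3.0) for _ in range(N_AGENTS)]
    for gen in range(1, generations + 1):
        learner = gradient_step(learner)      # dense, agent-specific objective
        population = evolve(population, rng)  # sparse, team-based objective
        if gen % migrate_every == 0:
            # Migration: a copy of the gradient-trained team replaces the
            # weakest team in the evolutionary population.
            population.sort(key=team_reward, reverse=True)
            population[-1] = list(learner)
    best = max(population, key=team_reward)
    return best, learner

if __name__ == "__main__":
    best, learner = run_merl_sketch()
    print("best team:", best, "team reward:", team_reward(best))
```

In this toy setting random mutation alone rarely finds a team that earns the sparse reward, so migrated teams supply the skills that evolution then selects on, which mirrors the information-transfer role the abstract attributes to migration.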

Cite

Text

Majumdar et al. "Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination." International Conference on Machine Learning, 2020.

Markdown

[Majumdar et al. "Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/majumdar2020icml-evolutionary/)

BibTeX

@inproceedings{majumdar2020icml-evolutionary,
  title     = {{Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination}},
  author    = {Majumdar, Somdeb and Khadka, Shauharda and Miret, Santiago and McAleer, Stephen and Tumer, Kagan},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {6651--6660},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/majumdar2020icml-evolutionary/}
}