Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning

Abstract

Centralized training with decentralized execution (CTDE) has achieved great success in cooperative multi-agent reinforcement learning (MARL) in practical applications. However, CTDE-based methods typically suffer from poor zero-shot generalization under dynamic team composition and varying partial observability. To tackle these issues, we propose a spontaneous grouping mechanism, termed Self-Organized Group (SOG), which features conductor election (CE) and message summary (MS). In CE, a certain number of conductors are elected every $T$ time-steps to temporarily construct groups, each with a conductor-follower consensus in which the followers are constrained to communicate only with their conductor. In MS, each conductor summarizes and distributes the received messages to all affiliated group members to maintain unified scheduling. SOG provides zero-shot generalization to a dynamic number of agents and varying partial observability. Extensive experiments on mainstream multi-agent benchmarks demonstrate the superiority of SOG.
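The abstract's CE and MS steps can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: in SOG the election, grouping, and summary are learned end-to-end, whereas here top-k scoring, nearest-conductor assignment, and mean pooling are hypothetical stand-ins.

```python
def elect_conductors(scores, k):
    """Elect the k agents with the highest election scores as conductors.
    (Top-k by a scalar score is an assumed stand-in for the learned election.)"""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def assign_followers(positions, conductors):
    """Assign each non-conductor agent to its nearest conductor, forming temporary
    groups. (Proximity-based grouping is an assumption for illustration.)"""
    groups = {c: [] for c in conductors}
    for i, p in enumerate(positions):
        if i in conductors:
            continue
        nearest = min(conductors, key=lambda c: abs(positions[c] - p))
        groups[nearest].append(i)
    return groups

def summarize_and_distribute(messages, groups):
    """Each conductor pools the messages of its group (followers only talk to
    their conductor) and broadcasts one summary back to every group member.
    (Mean pooling stands in for the learned message summary.)"""
    distributed = {}
    for conductor, followers in groups.items():
        members = [conductor] + followers
        summary = sum(messages[m] for m in members) / len(members)
        for m in members:
            distributed[m] = summary
    return distributed
```

Run every $T$ time-steps, this re-elects conductors and rebuilds groups, which is what lets the scheme handle a changing number of agents: nothing in the three functions depends on a fixed team size.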

Cite

Text

Shao et al. "Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning." Neural Information Processing Systems, 2022.

Markdown

[Shao et al. "Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/shao2022neurips-selforganized/)

BibTeX

@inproceedings{shao2022neurips-selforganized,
  title     = {{Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning}},
  author    = {Shao, Jianzhun and Lou, Zhiqiang and Zhang, Hongchang and Jiang, Yuhang and He, Shuncheng and Ji, Xiangyang},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/shao2022neurips-selforganized/}
}