Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning
Abstract
Centralized training with decentralized execution (CTDE) has achieved great success in cooperative multi-agent reinforcement learning (MARL) in practical applications. However, CTDE-based methods typically suffer from poor zero-shot generalization under dynamic team composition and varying partial observability. To tackle these issues, we propose a spontaneous grouping mechanism, termed Self-Organized Group (SOG), which features conductor election (CE) and message summary (MS). In CE, a certain number of conductors are elected every $T$ time steps to temporarily construct groups, each with a conductor-follower consensus in which the followers are constrained to communicate only with their conductor. In MS, each conductor summarizes the received messages and distributes the summary to all affiliated group members to achieve unified scheduling. SOG thus generalizes zero-shot to a dynamic number of agents and to varying partial observability. Extensive experiments on mainstream multi-agent benchmarks demonstrate the superiority of SOG.
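As a rough illustration of the two mechanisms, the minimal sketch below elects conductors and rebuilds groups every $T$ time steps, then has each conductor pool its group's messages and broadcast the summary to all affiliated members. All specifics here are assumptions for illustration: the top-k score election, nearest-conductor grouping, and mean-pooling summary are hypothetical stand-ins for the learned components described in the paper.

```python
import numpy as np

def elect_conductors(scores, num_conductors):
    # Hypothetical election rule: take the top-scoring agents as conductors.
    # (In SOG the election is learned; score-based top-k is a stand-in.)
    return set(np.argsort(scores)[-num_conductors:].tolist())

def form_groups(positions, conductors):
    # Illustrative grouping: each follower joins its nearest conductor,
    # so followers only ever communicate with their own conductor.
    groups = {c: [c] for c in conductors}
    for i in range(len(positions)):
        if i in conductors:
            continue
        nearest = min(conductors,
                      key=lambda c: np.linalg.norm(positions[i] - positions[c]))
        groups[nearest].append(i)
    return groups

def summarize_and_broadcast(messages, groups):
    # Message-summary stand-in: each conductor mean-pools its group's
    # messages and broadcasts one shared summary to every member.
    return {c: np.mean([messages[i] for i in members], axis=0)
            for c, members in groups.items()}

# Re-elect conductors and rebuild groups every T time steps.
T, num_agents, num_conductors = 5, 8, 2
rng = np.random.default_rng(0)
groups = None
for t in range(20):
    if t % T == 0:
        scores = rng.random(num_agents)          # stand-in election scores
        positions = rng.random((num_agents, 2))  # stand-in agent positions
        conductors = elect_conductors(scores, num_conductors)
        groups = form_groups(positions, conductors)
    messages = rng.random((num_agents, 4))       # per-agent message vectors
    summaries = summarize_and_broadcast(messages, groups)
```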
Cite
Text
Shao et al. "Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning." Neural Information Processing Systems, 2022.

Markdown
[Shao et al. "Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/shao2022neurips-selforganized/)

BibTeX
@inproceedings{shao2022neurips-selforganized,
  title = {{Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning}},
  author = {Shao, Jianzhun and Lou, Zhiqiang and Zhang, Hongchang and Jiang, Yuhang and He, Shuncheng and Ji, Xiangyang},
  booktitle = {Neural Information Processing Systems},
  year = {2022},
  url = {https://mlanthology.org/neurips/2022/shao2022neurips-selforganized/}
}