Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning
Abstract
Efficient collaboration in the centralized training with decentralized execution (CTDE) paradigm remains a challenge in cooperative multi-agent systems. We identify divergent action tendencies among agents as a significant obstacle to CTDE's training efficiency, requiring a large number of training samples to achieve a unified consensus on agents' policies. This divergence stems from the lack of adequate team consensus-related guidance signals during credit assignment in CTDE. To address this, we propose Intrinsic Action Tendency Consistency, a novel approach for cooperative multi-agent reinforcement learning. It integrates intrinsic rewards, obtained through an action model, into a reward-additive CTDE (RA-CTDE) framework. We formulate an action model that enables surrounding agents to predict the central agent's action tendency. Leveraging these predictions, we compute a cooperative intrinsic reward that encourages agents to align their actions with their neighbors' predictions. We establish the equivalence between RA-CTDE and CTDE through theoretical analyses, demonstrating that CTDE's training process can be achieved using N individual targets. Building on this insight, we introduce a novel method to combine intrinsic rewards and RA-CTDE. Extensive experiments on challenging tasks in SMAC, MPE, and GRF benchmarks showcase the improved performance of our method.
Cite
Text
Zhang et al. "Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I16.29711
Markdown
[Zhang et al. "Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/zhang2024aaai-intrinsic/) doi:10.1609/AAAI.V38I16.29711
BibTeX
@inproceedings{zhang2024aaai-intrinsic,
title = {{Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning}},
author = {Zhang, Junkai and Zhang, Yifan and Zhang, Xi Sheryl and Zang, Yifan and Cheng, Jian},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {17600-17608},
doi = {10.1609/AAAI.V38I16.29711},
url = {https://mlanthology.org/aaai/2024/zhang2024aaai-intrinsic/}
}