A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning
Abstract
In this paper, we propose a new maximum mutual information (MMI) framework for multi-agent reinforcement learning (MARL) that enables multiple agents to learn coordinated behaviors by regularizing the accumulated return with the mutual information between multi-agent actions. By introducing a latent variable to induce nonzero mutual information between multi-agent actions and applying a variational bound, we derive a tractable lower bound on the considered MMI-regularized objective function. Applying policy iteration to maximize the derived lower bound, we propose a practical algorithm named variational maximum mutual information multi-agent actor-critic (VM3-AC). We evaluate VM3-AC on several multi-agent tasks requiring coordination, and numerical results show that it outperforms other MARL algorithms.
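As a reading aid, the objective described in the abstract can be sketched as follows; the notation (regularization weight \alpha, latent variable z, variational distribution q_\xi) is illustrative and not necessarily the paper's exact formulation. The MMI-regularized objective augments the accumulated return with the mutual information between the N agents' simultaneous actions,

\[
J(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\left(r_{t} + \alpha\, I\!\left(a_{t}^{1}; \ldots; a_{t}^{N}\right)\right)\right],
\]

and tractability follows from the standard variational (Barber-Agakov) lower bound on mutual information, which for a latent variable z and an agent action a^i reads

\[
I\!\left(a^{i}; z\right) \;\ge\; \mathbb{E}_{p(a^{i},\, z)}\!\left[\log q_{\xi}\!\left(z \mid a^{i}\right)\right] + H(z),
\]

where q_\xi is a learned variational approximation to the true posterior p(z \mid a^i). Maximizing such a bound over the policies and q_\xi is what makes the policy-iteration scheme in VM3-AC practical.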
Cite
Text
Kim et al. "A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning." ICML 2022 Workshops: AI4ABM, 2022.Markdown
[Kim et al. "A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning." ICML 2022 Workshops: AI4ABM, 2022.](https://mlanthology.org/icmlw/2022/kim2022icmlw-variational/)BibTeX
@inproceedings{kim2022icmlw-variational,
title = {{A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning}},
author = {Kim, Woojun and Jung, Whiyoung and Cho, Myungsik and Sung, Youngchul},
booktitle = {ICML 2022 Workshops: AI4ABM},
year = {2022},
url = {https://mlanthology.org/icmlw/2022/kim2022icmlw-variational/}
}