Exploration via Joint Policy Diversity for Sparse-Reward Multi-Agent Tasks
Abstract
Exploration under sparse rewards is a key challenge in multi-agent reinforcement learning (MARL). Previous works argue that the complex dynamics among agents and the huge exploration space of MARL scenarios amplify the vulnerability of classical count-based exploration methods when combined with agents parameterized by neural networks, resulting in inefficient exploration. In this paper, we show that introducing constrained joint policy diversity into a classical count-based method can significantly improve exploration when agents are parameterized by neural networks. Specifically, we propose a joint policy diversity measure that quantifies the difference between the current joint policy and previous joint policies, and then use a filtering-based exploration constraint to further refine the joint policy diversity. Under the sparse-reward setting, we show that the proposed method significantly outperforms state-of-the-art methods in the multiple-particle environment, Google Research Football, and StarCraft II micromanagement tasks. To the best of our knowledge, on the hard 3s_vs_5z task, which requires non-trivial strategies to defeat the enemies, our method is the first to learn winning strategies without domain knowledge under the sparse-reward setting.
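The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea it describes: a classical count-based bonus modulated by a joint policy diversity term, refined by a filtering constraint. The 1/sqrt(N) bonus, the KL-based diversity over stored policy snapshots, the threshold filter, and all names and parameters (beta, div_threshold, etc.) are illustrative assumptions, not the paper's definitions.

import numpy as np
from collections import defaultdict

def count_bonus(counts, state_key, beta=0.1):
    """Classical count-based bonus: beta / sqrt(N(s)) (assumed form)."""
    counts[state_key] += 1
    return beta / np.sqrt(counts[state_key])

def joint_policy_diversity(current_probs, previous_probs_list, eps=1e-8):
    """Mean KL divergence between the current joint policy's action
    distribution at a state and those of stored previous joint policies."""
    if not previous_probs_list:
        return 0.0
    kls = [np.sum(current_probs * (np.log(current_probs + eps)
                                   - np.log(old + eps)))
           for old in previous_probs_list]
    return float(np.mean(kls))

def intrinsic_reward(counts, state_key, current_probs, snapshots,
                     beta=0.1, div_threshold=0.05):
    """Count bonus scaled by filtered joint policy diversity."""
    bonus = count_bonus(counts, state_key, beta)
    diversity = joint_policy_diversity(current_probs, snapshots)
    # Filtering-based constraint (assumed form): discard diversity
    # below a threshold so near-identical policies earn no extra bonus.
    diversity = diversity if diversity > div_threshold else 0.0
    return bonus * (1.0 + diversity)

# Toy usage: 4 joint actions, one stored snapshot of a previous policy.
counts = defaultdict(int)
current = np.array([0.4, 0.3, 0.2, 0.1])
snapshots = [np.array([0.25, 0.25, 0.25, 0.25])]
print(intrinsic_reward(counts, "s0", current, snapshots))

Here the diversity term amplifies the count bonus only in states where the current joint policy meaningfully departs from earlier ones; the actual paper's measure and constraint may differ in form.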
Cite
Text
Xu et al. "Exploration via Joint Policy Diversity for Sparse-Reward Multi-Agent Tasks." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/37
Markdown
[Xu et al. "Exploration via Joint Policy Diversity for Sparse-Reward Multi-Agent Tasks." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/xu2023ijcai-exploration/) doi:10.24963/IJCAI.2023/37
BibTeX
@inproceedings{xu2023ijcai-exploration,
title = {{Exploration via Joint Policy Diversity for Sparse-Reward Multi-Agent Tasks}},
author = {Xu, Pei and Zhang, Junge and Huang, Kaiqi},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2023},
pages = {326--334},
doi = {10.24963/IJCAI.2023/37},
url = {https://mlanthology.org/ijcai/2023/xu2023ijcai-exploration/}
}