Hybrid Learning for Multi-Agent Cooperation with Sub-Optimal Demonstrations
Abstract
This paper aims to learn multi-agent cooperation where each agent performs its actions in a decentralized way. In this setting, it is very challenging to learn decentralized policies when the rewards are global and sparse. Recently, learning from demonstrations (LfD) has provided a promising way to handle this challenge. However, in many practical tasks, the available demonstrations are often sub-optimal. To learn better policies from these sub-optimal demonstrations, this paper follows a centralized-learning, decentralized-execution framework and proposes a novel hybrid learning method based on multi-agent actor-critic. First, the expert trajectory returns generated from the demonstration actions are used to pre-train the centralized critic network. Then, multi-agent decisions are made by best-response dynamics based on the critic and used to train the decentralized actor networks. Finally, the demonstrations are updated by the actor networks, and the critic and actor networks are learned jointly by running the above two steps alternately. We evaluate the proposed approach on a real-time strategy combat game. Experimental results show that it outperforms several competing demonstration-based methods.
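The following is a minimal sketch, not the authors' code, of the alternating loop described in the abstract: pre-train a centralized critic on demonstration returns, derive joint actions by best-response dynamics under that critic, fit the decentralized actors to those actions, then refresh the demonstrations with the actors and repeat. The toy dimensions, the random "demonstrations", and helper names such as `best_response` are illustrative assumptions, not details from the paper.

```python
# Sketch of the hybrid learning loop under assumed toy dimensions (not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

N_AGENTS, OBS_DIM, ACT_DIM, STATE_DIM = 2, 8, 4, 16

# Centralized critic Q(s, a_1, ..., a_n) and one decentralized actor per agent.
critic = nn.Sequential(nn.Linear(STATE_DIM + N_AGENTS * ACT_DIM, 64),
                       nn.ReLU(), nn.Linear(64, 1))
actors = [nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(), nn.Linear(64, ACT_DIM))
          for _ in range(N_AGENTS)]
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
actor_opts = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]

def onehot(idx):
    return torch.eye(ACT_DIM)[idx]

def best_response(state, joint_act):
    """Best-response dynamics: each agent in turn switches to the action that
    maximizes the centralized critic while the other agents' actions are fixed."""
    joint_act = joint_act.clone()
    with torch.no_grad():
        for i in range(N_AGENTS):
            best_a, best_q = joint_act[i], -float("inf")
            for a in range(ACT_DIM):
                joint_act[i] = a
                acts = torch.cat([onehot(j) for j in joint_act])
                q = critic(torch.cat([state, acts])).item()
                if q > best_q:
                    best_q, best_a = q, a
            joint_act[i] = best_a
    return joint_act

# Placeholder sub-optimal demonstrations: (global state, per-agent obs, joint action, return).
demos = [(torch.randn(STATE_DIM), torch.randn(N_AGENTS, OBS_DIM),
          torch.randint(ACT_DIM, (N_AGENTS,)), torch.randn(1)) for _ in range(64)]

for iteration in range(3):
    # Step 1: pre-train / refit the centralized critic on demonstration trajectory returns.
    for state, _, joint_act, ret in demos:
        acts = torch.cat([onehot(a) for a in joint_act])
        loss = (critic(torch.cat([state, acts])) - ret).pow(2).mean()
        critic_opt.zero_grad()
        loss.backward()
        critic_opt.step()

    # Step 2: train each decentralized actor toward the best-response joint action.
    for state, obs, joint_act, _ in demos:
        target = best_response(state, joint_act)
        for i in range(N_AGENTS):
            logits = actors[i](obs[i])
            loss = F.cross_entropy(logits.unsqueeze(0), target[i].unsqueeze(0))
            actor_opts[i].zero_grad()
            loss.backward()
            actor_opts[i].step()

    # Step 3: update the demonstration actions with the current actors, then repeat.
    with torch.no_grad():
        demos = [(s, o, torch.stack([actors[i](o[i]).argmax() for i in range(N_AGENTS)]), r)
                 for s, o, _, r in demos]
```

In the sketch, greedy actions stand in for the actors' decentralized execution; the random tensors merely take the place of trajectories collected from the real-time strategy combat game used in the paper.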
Cite
Text
Peng et al. "Hybrid Learning for Multi-Agent Cooperation with Sub-Optimal Demonstrations." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/420
Markdown
[Peng et al. "Hybrid Learning for Multi-Agent Cooperation with Sub-Optimal Demonstrations." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/peng2020ijcai-hybrid/) doi:10.24963/IJCAI.2020/420
BibTeX
@inproceedings{peng2020ijcai-hybrid,
title = {{Hybrid Learning for Multi-Agent Cooperation with Sub-Optimal Demonstrations}},
author = {Peng, Peixi and Xing, Junliang and Cao, Lili},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2020},
pages = {3037-3043},
doi = {10.24963/IJCAI.2020/420},
url = {https://mlanthology.org/ijcai/2020/peng2020ijcai-hybrid/}
}