Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)
Abstract
Exploration efficiency is a challenge for multi-agent reinforcement learning (MARL), because the policy learned by cooperative MARL depends on the interaction among agents. Less informative rewards also limit the learning speed of MARL compared with the informative labels available in supervised learning. This paper proposes a novel communication method that helps agents focus on different exploration subareas, guiding MARL to explore faster. We propose a predictive network that forecasts the reward of the current state-action pair, and we use the guidance learned by this network to modify the reward function. An improved prioritized experience replay helps agents make better use of the knowledge learned by other agents. Experimental results demonstrate that the proposed algorithm outperforms existing methods in cooperative multi-agent environments.
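The abstract does not specify the architecture of the predictive network, the rule by which its output modifies the reward, or what the improvement to prioritized experience replay consists of. The following is only a minimal sketch of the general idea, under stated assumptions: a linear model stands in for the predictive network, the shaped reward is assumed to be the environment reward plus a scaled prediction, and the replay buffer uses standard proportional prioritization shared across agents. The names `RewardPredictor`, `shaped_reward`, `PrioritizedReplay`, and the parameters `beta`, `alpha`, and `eps` are hypothetical and do not come from the paper.

```python
import random


class RewardPredictor:
    """Stand-in for the predictive network (assumption: a linear model).

    Predicts r_hat = w . x + b for a state-action feature vector x and is
    trained by stochastic gradient descent on squared error.
    """

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, reward):
        # One SGD step; returns the prediction error before the step.
        err = self.predict(x) - reward
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err
        return err


def shaped_reward(env_reward, predicted_reward, beta=0.5):
    # Hypothetical shaping rule: the paper's actual modification is not
    # given in the abstract.
    return env_reward + beta * predicted_reward


class PrioritizedReplay:
    """Shared buffer with proportional prioritization.

    Sampling probability is proportional to (|error| + eps) ** alpha.
    This is the standard scheme, not the paper's improved variant.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps
        self.items = []
        self.priorities = []

    def add(self, transition, error):
        if len(self.items) >= self.capacity:
            # Evict the oldest transition when the buffer is full.
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append((abs(error) + self.eps) ** self.alpha)

    def sample(self, k, rng=random):
        # Sample with replacement, weighted by priority.
        return rng.choices(self.items, weights=self.priorities, k=k)
```

In a training loop of this shape, each agent would store its transitions in the shared buffer together with the predictor's error, so that transitions one agent found surprising are more likely to be replayed by the others.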
Cite
Text
Wang et al. "Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I10.7247
Markdown
[Wang et al. "Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/wang2020aaai-optimal/) doi:10.1609/AAAI.V34I10.7247
BibTeX
@inproceedings{wang2020aaai-optimal,
title = {{Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)}},
author = {Wang, Qisheng and Wang, Qichao and Li, Xiao},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2020},
pages = {13949-13950},
doi = {10.1609/AAAI.V34I10.7247},
url = {https://mlanthology.org/aaai/2020/wang2020aaai-optimal/}
}