Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)

Abstract

Exploration efficiency is a challenge for multi-agent reinforcement learning (MARL), because the policy learned by confederate MARL depends on the interactions among agents. Rewards are also less informative than the labels used in supervised learning, which further restricts the learning speed of MARL. This paper proposes a novel communication method that helps agents focus on different exploration subareas, guiding MARL toward faster exploration. We propose a predictive network that forecasts the reward of the current state-action pair, and we use the guidance learned by this network to modify the reward function. An improved prioritized experience replay is employed to help agents take better advantage of the distinct knowledge learned by different agents. Experimental results demonstrate that the proposed algorithm outperforms existing methods in cooperative multi-agent environments.
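
The abstract does not give the network architecture, the reward-modification rule, or the details of the improved replay scheme, so the following is only a minimal sketch of the three ingredients it names, under stated assumptions. The linear model standing in for the predictive network, the additive shaping rule with weight `beta`, and every name in the code are hypothetical. The sampling function is the standard proportional form of prioritized experience replay, not the authors' improved variant.

```python
import random


class LinearRewardPredictor:
    """Hypothetical stand-in for the predictive network.

    Models the reward of a state-action feature vector x as r_hat = w . x
    and is trained by stochastic gradient descent on squared error.
    """

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, reward):
        # One SGD step on 0.5 * (r_hat - reward) ** 2.
        error = self.predict(x) - reward
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        return error


def shaped_reward(env_reward, predicted_reward, beta=0.5):
    """Assumed shaping rule: environment reward plus a scaled prediction."""
    return env_reward + beta * predicted_reward


def sample_prioritized(priorities, k, rng, alpha=0.6):
    """Standard proportional prioritized sampling: P(i) ~ priority_i ** alpha."""
    weights = [p ** alpha for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=k)


if __name__ == "__main__":
    rng = random.Random(0)
    predictor = LinearRewardPredictor(n_features=2)

    # Toy data: the true reward is a fixed linear function of the features.
    for _ in range(2000):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        predictor.update(x, 2.0 * x[0] - 1.0 * x[1])

    x = [0.5, 0.5]
    print(shaped_reward(0.0, predictor.predict(x)))
    print(sample_prioritized([0.1, 1.0, 0.5], k=5, rng=rng))
```

In this sketch the shaped reward would replace the environment reward in each agent's learning update, and the priorities passed to `sample_prioritized` would typically be absolute temporal-difference errors of stored transitions. Both choices are assumptions made for illustration.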

Cite

Text

Wang et al. "Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I10.7247

Markdown

[Wang et al. "Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/wang2020aaai-optimal/) doi:10.1609/AAAI.V34I10.7247

BibTeX

@inproceedings{wang2020aaai-optimal,
  title     = {{Optimal Exploration Algorithm of Multi-Agent Reinforcement Learning Methods (Student Abstract)}},
  author    = {Wang, Qisheng and Wang, Qichao and Li, Xiao},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {13949--13950},
  doi       = {10.1609/AAAI.V34I10.7247},
  url       = {https://mlanthology.org/aaai/2020/wang2020aaai-optimal/}
}