Cooperative Heterogeneous Deep Reinforcement Learning

Abstract

Numerous deep reinforcement learning agents have been proposed, and each of them has its strengths and flaws. In this work, we present a Cooperative Heterogeneous Deep Reinforcement Learning (CHDRL) framework that can learn a policy by integrating the advantages of heterogeneous agents. Specifically, we propose a cooperative learning framework that groups heterogeneous agents into two classes: global agents and local agents. Global agents are off-policy agents that can utilize experiences from the other agents. Local agents are either on-policy agents or population-based evolutionary algorithm (EA) agents that can explore the local area effectively. We employ the sample-efficient global agents to guide the learning of local agents, so that local agents benefit from the sample efficiency of the global agents while maintaining their own advantages, e.g., stability. Global agents, in turn, benefit from effective local searches. Experimental studies on a range of continuous control tasks from the MuJoCo benchmark show that CHDRL achieves better performance than state-of-the-art baselines.
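
The abstract describes a two-tier loop: local on-policy/EA agents explore and contribute their transitions to a sample-efficient global off-policy agent, whose policy in turn guides the local agents. The plain-Python sketch below illustrates only that data flow; the class names, the toy environment, and the imitation hook are illustrative assumptions, not the authors' implementation.

# Minimal structural sketch of the cooperative loop described in the abstract.
# All class names, method signatures, and the guidance rule are assumptions
# made for illustration, not the authors' reference implementation.
import random
from collections import deque

class ReplayBuffer:
    """Shared buffer: the global off-policy agent reuses transitions
    collected by every local agent."""
    def __init__(self, capacity=100_000):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        self.storage.append(transition)

    def sample(self, batch_size):
        return random.sample(list(self.storage), min(batch_size, len(self.storage)))

class GlobalAgent:
    """Placeholder for a sample-efficient off-policy learner (e.g., SAC/TD3-style)."""
    def update(self, batch):
        pass  # gradient step on the sampled off-policy batch (omitted)

    def act(self, state):
        return 0.0  # action used to guide the local agents

class LocalAgent:
    """Placeholder for an on-policy learner or an EA population member."""
    def rollout(self, env_step, horizon=5):
        transitions, state = [], 0.0
        for _ in range(horizon):
            action = random.gauss(0.0, 1.0)  # effective local exploration
            next_state, reward = env_step(state, action)
            transitions.append((state, action, reward, next_state))
            state = next_state
        return transitions

    def imitate(self, global_agent, states):
        pass  # nudge the local policy toward the global agent's actions (omitted)

def env_step(state, action):
    # Toy stand-in for a MuJoCo environment step.
    return state + 0.1 * action, -abs(state)

def train(iterations=3, batch_size=4):
    buffer = ReplayBuffer()
    global_agent = GlobalAgent()
    local_agents = [LocalAgent() for _ in range(2)]

    for _ in range(iterations):
        for local in local_agents:
            transitions = local.rollout(env_step)
            for t in transitions:
                buffer.add(t)  # local experience feeds the global off-policy agent
            local.imitate(global_agent, [t[0] for t in transitions])  # global agent guides local search
        if len(buffer.storage) >= batch_size:
            global_agent.update(buffer.sample(batch_size))

if __name__ == "__main__":
    train()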

Cite

Text

Zheng et al. "Cooperative Heterogeneous Deep Reinforcement Learning." Neural Information Processing Systems, 2020.

Markdown

[Zheng et al. "Cooperative Heterogeneous Deep Reinforcement Learning." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/zheng2020neurips-cooperative/)

BibTeX

@inproceedings{zheng2020neurips-cooperative,
  title     = {{Cooperative Heterogeneous Deep Reinforcement Learning}},
  author    = {Zheng, Han and Wei, Pengfei and Jiang, Jing and Long, Guodong and Lu, Qinghua and Zhang, Chengqi},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/zheng2020neurips-cooperative/}
}