MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
Abstract
Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks and achieves fast adaptation to new tasks. Despite recent progress, efficient exploration in meta-RL remains a key challenge in sparse-reward tasks, as it requires quickly finding informative task-relevant experiences in both meta-training and adaptation. To address this challenge, we explicitly model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning, and introduce a novel empowerment-driven exploration objective, which aims to maximize information gain for task identification. We derive a corresponding intrinsic reward and develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies by sharing the knowledge of task inference. Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on various sparse-reward MuJoCo locomotion tasks and more complex sparse-reward Meta-World tasks.
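To make the empowerment-driven idea concrete, below is a minimal sketch of an information-gain style intrinsic reward for task identification, in the spirit of the abstract. It rewards transitions whose outcome becomes much easier to predict once the task is known, a common proxy for how informative a transition is about the task. All names here (RewardPredictor, intrinsic_reward, the task embedding) are hypothetical illustrations under assumed interfaces, not the authors' implementation.

import torch
import torch.nn as nn

class RewardPredictor(nn.Module):
    """Predicts the extrinsic reward from (s, a), optionally conditioned on a task embedding."""
    def __init__(self, obs_dim, act_dim, task_dim=0, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + task_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act, task=None):
        x = [obs, act] if task is None else [obs, act, task]
        return self.net(torch.cat(x, dim=-1))

def intrinsic_reward(pred_no_task, pred_with_task, obs, act, rew, task_emb):
    """Intrinsic reward as the reduction in reward-prediction error when the
    task is known: a large gap means the transition is informative about the task."""
    with torch.no_grad():
        err_no_task = (pred_no_task(obs, act) - rew).pow(2)
        err_with_task = (pred_with_task(obs, act, task_emb) - rew).pow(2)
    # Transitions where task knowledge helps most get the largest exploration bonus.
    return (err_no_task - err_with_task).squeeze(-1)

In such a setup, the exploration policy would be trained on this intrinsic reward (both predictors being regressed on the extrinsic reward), while a separate exploitation policy optimizes the task reward itself; the design choice of sharing a learned task embedding between the two corresponds to the knowledge sharing described above.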
Cite
Text
Zhang et al. "MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration." International Conference on Machine Learning, 2021.

Markdown

[Zhang et al. "MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/zhang2021icml-metacure/)

BibTeX
@inproceedings{zhang2021icml-metacure,
  title = {{MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration}},
  author = {Zhang, Jin and Wang, Jianhao and Hu, Hao and Chen, Tong and Chen, Yingfeng and Fan, Changjie and Zhang, Chongjie},
  booktitle = {International Conference on Machine Learning},
  year = {2021},
  pages = {12600--12610},
  volume = {139},
  url = {https://mlanthology.org/icml/2021/zhang2021icml-metacure/}
}