Towards Effective Context for Meta-Reinforcement Learning: An Approach Based on Contrastive Learning
Abstract
Context, the embedding of previously collected trajectories, is a powerful construct for Meta-Reinforcement Learning (Meta-RL) algorithms. By conditioning on an effective context, Meta-RL policies can generalize to new tasks within a few adaptation steps. We argue that improving the quality of context involves answering two questions: 1. How can we train a compact and sufficient encoder that embeds the task-specific information contained in prior trajectories? 2. How can we collect informative trajectories whose corresponding context reflects the specification of the task? To this end, we propose a novel Meta-RL framework called CCM (Contrastive learning augmented Context-based Meta-RL). We first focus on the contrastive nature behind different tasks and leverage it to train a compact and sufficient context encoder. Further, we train a separate exploration policy and theoretically derive a new information-gain-based objective that aims to collect informative trajectories in a few steps. Empirically, we evaluate our approaches on common benchmarks as well as several complex sparse-reward environments. The experimental results show that CCM outperforms state-of-the-art algorithms by addressing both of the problems above.
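The abstract does not spell out the contrastive objective, so the following is only a generic InfoNCE-style sketch of how a context encoder could be trained to pull together embeddings of trajectories from the same task and push apart embeddings from different tasks. The function name `info_nce_loss`, the `temperature` value, and the assumption that row `i` of both inputs comes from task `i` are illustrative choices, not details taken from the paper.

```python
import numpy as np

def info_nce_loss(queries, keys, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's exact objective).

    queries, keys: arrays of shape (n_tasks, dim). Row i of each is assumed
    to be an embedding of a different trajectory batch from the same task i,
    so (queries[i], keys[i]) is the positive pair and all other rows of
    `keys` act as negatives.
    """
    # L2-normalize so the dot product is a cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # Pairwise similarities between every query and every key.
    logits = q @ k.T / temperature

    # Numerically stable log-softmax over the keys for each query.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the matching task (the diagonal) as the target.
    return -np.mean(np.diag(log_probs))
```

Under these assumptions, the loss is small when each query embedding is most similar to the key from its own task and grows when embeddings of different tasks are confused, which is the property a compact, task-discriminative context would need.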
Cite
Text
Fu et al. "Towards Effective Context for Meta-Reinforcement Learning: An Approach Based on Contrastive Learning." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I8.16914
Markdown
[Fu et al. "Towards Effective Context for Meta-Reinforcement Learning: An Approach Based on Contrastive Learning." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/fu2021aaai-effective/) doi:10.1609/AAAI.V35I8.16914
BibTeX
@inproceedings{fu2021aaai-effective,
title = {{Towards Effective Context for Meta-Reinforcement Learning: An Approach Based on Contrastive Learning}},
author = {Fu, Haotian and Tang, Hongyao and Hao, Jianye and Chen, Chen and Feng, Xidong and Li, Dong and Liu, Wulong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {7457-7465},
doi = {10.1609/AAAI.V35I8.16914},
url = {https://mlanthology.org/aaai/2021/fu2021aaai-effective/}
}