Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces
Abstract
Meta-reinforcement learning (RL) addresses the sample inefficiency of deep RL by leveraging experience from past tasks to solve a new task. However, most meta-RL methods require partially or fully on-policy data, i.e., they cannot reuse data collected by past policies, which limits gains in sample efficiency. To alleviate this problem, we propose a novel off-policy meta-RL method, embedding learning and evaluation of uncertainty (ELUE). ELUE learns a feature embedding space shared among tasks, together with beliefs over that space and a belief-conditional policy and Q-function. This approach has two major advantages: it can evaluate task uncertainty, which is expected to enable more precise exploration, and it can improve its performance by updating the belief. We show that the proposed method outperforms existing methods in experiments on a meta-RL benchmark.
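As a rough illustration of the architecture the abstract describes, the sketch below shows a belief-conditioned Q-function whose task belief (here assumed Gaussian) is inferred from a batch of replay-buffer transitions. All module names, network sizes, and the choice of a Gaussian belief are illustrative assumptions, not the authors' ELUE implementation.

```python
# Minimal sketch (PyTorch) of a belief-conditioned critic over a shared task-embedding
# space. Shapes, names, and the Gaussian belief are assumptions for illustration only.
import torch
import torch.nn as nn

LATENT_DIM = 5                  # dimensionality of the shared task-embedding space (assumed)
OBS_DIM, ACT_DIM = 8, 2         # toy observation/action sizes (assumed)


class BeliefEncoder(nn.Module):
    """Maps a batch of context transitions to a Gaussian belief over the task embedding."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM + 1 + OBS_DIM, 64), nn.ReLU(),
            nn.Linear(64, 2 * LATENT_DIM),             # mean and log-variance
        )

    def forward(self, transitions):                     # (B, obs + act + reward + next_obs)
        stats = self.net(transitions).mean(dim=0)       # pool over the context batch
        return stats[:LATENT_DIM], stats[LATENT_DIM:]   # belief mean, belief log-variance


class BeliefConditionedQ(nn.Module):
    """Q(s, a | belief): belief statistics are concatenated to the state-action input."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM + 2 * LATENT_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, obs, act, belief_mean, belief_log_var):
        x = torch.cat([obs, act,
                       belief_mean.expand(obs.shape[0], -1),
                       belief_log_var.expand(obs.shape[0], -1)], dim=-1)
        return self.net(x)


# Usage: infer a belief from off-policy context transitions, then evaluate Q under it.
encoder, q_fn = BeliefEncoder(), BeliefConditionedQ()
context = torch.randn(32, OBS_DIM + ACT_DIM + 1 + OBS_DIM)   # samples from a replay buffer
mean, log_var = encoder(context)                              # current task belief
q = q_fn(torch.randn(4, OBS_DIM), torch.randn(4, ACT_DIM), mean, log_var)
print(q.shape)                                                # torch.Size([4, 1])
```

Because the belief is inferred from arbitrary stored transitions rather than fresh rollouts of the current policy, networks conditioned on it can in principle be trained with off-policy data, which is the property the abstract emphasizes; the belief's variance term also exposes the task uncertainty mentioned there.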
Cite
Text
Imagawa et al. "Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces." ICML 2020 Workshops: LifelongML, 2020.
Markdown
[Imagawa et al. "Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces." ICML 2020 Workshops: LifelongML, 2020.](https://mlanthology.org/icmlw/2020/imagawa2020icmlw-offpolicy/)
BibTeX
@inproceedings{imagawa2020icmlw-offpolicy,
  title = {{Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces}},
  author = {Imagawa, Takahisa and Hiraoka, Takuya and Tsuruoka, Yoshimasa},
  booktitle = {ICML 2020 Workshops: LifelongML},
  year = {2020},
  url = {https://mlanthology.org/icmlw/2020/imagawa2020icmlw-offpolicy/}
}