Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces

Abstract

Meta-reinforcement learning (RL) addresses the sample inefficiency of deep RL by leveraging experience gathered in past tasks to solve new tasks. However, most meta-RL methods require partially or fully on-policy data, i.e., they cannot reuse data collected by past policies, which limits gains in sample efficiency. To alleviate this problem, we propose a novel off-policy meta-RL method, embedding learning and evaluation of uncertainty (ELUE). ELUE learns a feature embedding space shared among tasks, together with beliefs over that space and a belief-conditioned policy and Q-function. This approach has two major advantages: it can evaluate task uncertainty, which is expected to contribute to more precise exploration, and it can improve performance by updating the belief. Experiments on a meta-RL benchmark show that the proposed method outperforms existing methods.
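To make the architecture in the abstract concrete, the sketch below shows one plausible way to wire a belief over a task embedding space into a belief-conditioned Q-function. This is a hypothetical illustration, not the authors' implementation: the module names, dimensions, and the mean-pooled Gaussian belief encoder are assumptions for demonstration only.

```python
import torch
import torch.nn as nn

class BeliefEncoder(nn.Module):
    """Maps a batch of task transitions to a Gaussian belief over the task embedding.
    (Hypothetical module; the paper's exact architecture may differ.)"""
    def __init__(self, transition_dim, embed_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * embed_dim),  # outputs mean and log-variance
        )
        self.embed_dim = embed_dim

    def forward(self, transitions):
        # transitions: (num_transitions, transition_dim) collected in the current task
        params = self.net(transitions).mean(dim=0)   # permutation-invariant aggregation
        mean, log_var = params.split(self.embed_dim)
        return mean, log_var                          # belief = N(mean, diag(exp(log_var)))

class BeliefConditionedQ(nn.Module):
    """Q-function that conditions on (state, action, belief)."""
    def __init__(self, state_dim, action_dim, embed_dim, hidden=128):
        super().__init__()
        in_dim = state_dim + action_dim + 2 * embed_dim  # belief summarized by (mean, log_var)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action, belief_mean, belief_log_var):
        x = torch.cat([state, action, belief_mean, belief_log_var], dim=-1)
        return self.net(x)

# Toy usage with made-up dimensions: (s, a, r, s') tuples form the task context.
state_dim, action_dim, embed_dim = 8, 2, 5
encoder = BeliefEncoder(transition_dim=state_dim + action_dim + 1 + state_dim,
                        embed_dim=embed_dim)
q_fn = BeliefConditionedQ(state_dim, action_dim, embed_dim)

context = torch.randn(32, state_dim + action_dim + 1 + state_dim)
mean, log_var = encoder(context)
q = q_fn(torch.randn(1, state_dim), torch.randn(1, action_dim),
         mean.expand(1, -1), log_var.expand(1, -1))
print(q.shape)  # torch.Size([1, 1])
```

Because the Q-function takes the belief parameters as an ordinary input, refining the belief with newly observed transitions changes the value estimates without retraining the network, which is the intuition behind "improving performance by updating the belief."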

Cite

Text

Imagawa et al. "Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces." ICML 2020 Workshops: LifelongML, 2020.

Markdown

[Imagawa et al. "Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces." ICML 2020 Workshops: LifelongML, 2020.](https://mlanthology.org/icmlw/2020/imagawa2020icmlw-offpolicy/)

BibTeX

@inproceedings{imagawa2020icmlw-offpolicy,
  title     = {{Off-Policy Meta-Reinforcement Learning Based on Feature Embedding Spaces}},
  author    = {Imagawa, Takahisa and Hiraoka, Takuya and Tsuruoka, Yoshimasa},
  booktitle = {ICML 2020 Workshops: LifelongML},
  year      = {2020},
  url       = {https://mlanthology.org/icmlw/2020/imagawa2020icmlw-offpolicy/}
}