Unsupervised Curricula for Visual Meta-Reinforcement Learning

Abstract

In principle, meta-reinforcement learning algorithms leverage experience across many tasks to learn fast and effective reinforcement learning (RL) strategies. However, current meta-RL approaches rely on manually defined distributions of training tasks, and hand-crafting these task distributions can be challenging and time-consuming. Can "useful" pre-training tasks be discovered in an unsupervised manner? We develop an unsupervised algorithm for inducing an adaptive meta-training task distribution, i.e., an automatic curriculum, by modeling unsupervised interaction in a visual environment. The task distribution is scaffolded by a parametric density model of the meta-learner's trajectory distribution. We formulate unsupervised meta-RL as information maximization between a latent task variable and the meta-learner's data distribution, and describe a practical instantiation which alternates between integration of recent experience into the task distribution and meta-learning of the updated tasks. Repeating this procedure leads to iterative reorganization such that the curriculum adapts as the meta-learner's data distribution shifts. Moreover, we show how discriminative clustering frameworks for visual representations can support trajectory-level task acquisition and exploration in domains with pixel observations, avoiding the pitfalls of alternatives. In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that both transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient meta-learning of test task distributions.
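
The alternation described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes low-dimensional states rather than pixels, uses a unit-variance Gaussian mixture as a stand-in for the parametric density model, and assumes a per-task reward of the form log q(z | s) - log p(z), a common variational surrogate for mutual information. The names `fit_mixture`, `posterior`, `task_reward`, `run_curriculum`, and the `collect` callable are all hypothetical.

```python
import numpy as np

def posterior(states, means, weights):
    """q(z | s): responsibility of each mixture component for each state."""
    sq = ((states[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    logits = np.log(weights)[None, :] - 0.5 * sq
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def fit_mixture(states, k, iters=20, seed=0):
    """Fit a unit-variance Gaussian mixture with EM (assumed density model)."""
    rng = np.random.default_rng(seed)
    means = states[rng.choice(len(states), size=k, replace=False)]
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        resp = posterior(states, means, weights)      # E-step
        counts = resp.sum(axis=0) + 1e-8
        means = (resp.T @ states) / counts[:, None]   # M-step: component means
        weights = counts / counts.sum()               # M-step: task prior p(z)
    return means, weights

def task_reward(states, z, means, weights):
    """Assumed reward for latent task z: log q(z | s) - log p(z)."""
    q = posterior(states, means, weights)
    return np.log(q[:, z] + 1e-8) - np.log(weights[z])

def run_curriculum(collect, k=3, rounds=2):
    """Alternate between fitting the task model to recent experience and
    (in a real system) meta-learning on the rewards that model induces."""
    model = None
    for r in range(rounds):
        states = collect(model)            # hypothetical: roll out the meta-learner
        model = fit_mixture(states, k, seed=r)
        # A meta-RL update using task_reward(..., *model) would go here.
    return model
```

Here `collect` stands in for rolling out the current meta-learner; in this sketch the meta-learning step itself is left as a comment, so only the task-distribution half of the loop actually runs.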

Cite

Text

Jabri et al. "Unsupervised Curricula for Visual Meta-Reinforcement Learning." Neural Information Processing Systems, 2019.

Markdown

[Jabri et al. "Unsupervised Curricula for Visual Meta-Reinforcement Learning." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/jabri2019neurips-unsupervised/)

BibTeX

@inproceedings{jabri2019neurips-unsupervised,
  title     = {{Unsupervised Curricula for Visual Meta-Reinforcement Learning}},
  author    = {Jabri, Allan and Hsu, Kyle and Gupta, Abhishek and Eysenbach, Ben and Levine, Sergey and Finn, Chelsea},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {10519-10531},
  url       = {https://mlanthology.org/neurips/2019/jabri2019neurips-unsupervised/}
}