Pretraining Reward-Free Representations for Data-Efficient Reinforcement Learning

Abstract

Data efficiency poses a major challenge for deep reinforcement learning. We approach this issue from the perspective of self-supervised representation learning, leveraging reward-free exploratory data to pretrain encoder networks. We employ a novel combination of latent dynamics modelling and goal-reaching objectives, which exploit the inherent structure of data in reinforcement learning. We demonstrate that our method scales well with network capacity and pretraining data. When evaluated on the Atari 100k data-efficiency benchmark, our approach significantly outperforms previous methods combining unsupervised pretraining with task-specific finetuning, and approaches human-level performance.
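The abstract only names the two pretraining objectives; as an illustration, here is a minimal sketch of how a latent dynamics loss and a goal-reaching loss might be combined on reward-free data. Linear maps stand in for the encoder, dynamics model, and goal head, and all names, dimensions, and the steps-to-goal regression target are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen for illustration only.
obs_dim, latent_dim, action_dim = 16, 8, 4

# Linear stand-ins for the encoder, the latent dynamics model,
# and the goal-reaching head (assumed components, not the paper's).
W_enc = rng.normal(size=(obs_dim, latent_dim)) * 0.1
W_dyn = rng.normal(size=(latent_dim + action_dim, latent_dim)) * 0.1
W_goal = rng.normal(size=(2 * latent_dim, 1)) * 0.1

def encode(obs):
    return obs @ W_enc

def predict_next(z, a):
    # Predict the next latent state from the current latent and action.
    return np.concatenate([z, a], axis=-1) @ W_dyn

def combined_loss(obs_t, action_t, obs_next, obs_goal, steps_to_goal):
    z_t, z_next, z_goal = encode(obs_t), encode(obs_next), encode(obs_goal)
    # Latent dynamics objective: match the predicted next latent
    # to the encoding of the actually observed next state.
    dyn_err = predict_next(z_t, action_t) - z_next
    dyn_loss = np.mean(dyn_err ** 2)
    # Goal-reaching objective (one possible form): regress the number
    # of steps to a goal observation sampled from the same trajectory.
    pred_steps = np.concatenate([z_t, z_goal], axis=-1) @ W_goal
    goal_loss = np.mean((pred_steps.squeeze(-1) - steps_to_goal) ** 2)
    # Both terms are reward-free: no environment reward appears here.
    return dyn_loss + goal_loss

# A random batch stands in for reward-free exploratory data.
batch = 32
loss = combined_loss(
    rng.normal(size=(batch, obs_dim)),
    rng.normal(size=(batch, action_dim)),
    rng.normal(size=(batch, obs_dim)),
    rng.normal(size=(batch, obs_dim)),
    rng.integers(1, 10, size=batch).astype(float),
)
```

In a full implementation the linear maps would be deep networks trained by gradient descent on this combined loss, with the pretrained encoder then finetuned on the downstream task.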

Cite

Text

Anonymous. "Pretraining Reward-Free Representations for Data-Efficient Reinforcement Learning." ICLR 2021 Workshops: SSL-RL, 2021.

Markdown

[Anonymous. "Pretraining Reward-Free Representations for Data-Efficient Reinforcement Learning." ICLR 2021 Workshops: SSL-RL, 2021.](https://mlanthology.org/iclrw/2021/anonymous2021iclrw-pretraining/)

BibTeX

@inproceedings{anonymous2021iclrw-pretraining,
  title     = {{Pretraining Reward-Free Representations for Data-Efficient Reinforcement Learning}},
  author    = {Anonymous},
  booktitle = {ICLR 2021 Workshops: SSL-RL},
  year      = {2021},
  url       = {https://mlanthology.org/iclrw/2021/anonymous2021iclrw-pretraining/}
}