Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels

Abstract

Reinforcement learning (RL) aims to train agents that autonomously perform complex tasks. To this end, a reward signal is used to steer the learning process. While successful in many circumstances, the approach is typically data-hungry, requiring large amounts of task-specific interaction between agent and environment to learn efficient behaviors. To alleviate this, unsupervised RL proposes to collect data through self-supervised interaction to accelerate task-specific adaptation. However, whether current unsupervised strategies lead to improved generalization capabilities is still unclear, especially when the input observations are high-dimensional. In this work, we advance the field by closing the performance gap on the Unsupervised RL Benchmark, a collection of tasks to be solved in a data-efficient manner after interacting with the environment in a self-supervised way. Our approach uses unsupervised exploration to collect the experience used to pre-train a world model. Then, when fine-tuning for downstream tasks, the agent leverages the learned model and a hybrid planner to adapt efficiently to the given tasks, achieving results comparable to task-specific baselines while using 20x less data. We extensively evaluate our work, comparing several exploration methods and improving the fine-tuning process by studying the interactions between the learned components. Furthermore, we investigate the limitations of the pre-trained agent, gaining insights into how these influence the decision process and shedding light on new research directions.
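To make the two-phase recipe described above concrete, here is a minimal, self-contained Python sketch. It is built on heavy assumptions: `ToyEnv`, `WorldModel`, `explore_action`, and `plan_action` are illustrative stand-ins invented for this example, not the paper's agent (which learns a world model from pixels and uses a hybrid planner). Only the structure mirrors the abstract: a long reward-free phase trains a predictive model from exploration data, then a much shorter fine-tuning phase plans with the pre-trained model using the task reward.

```python
"""Hypothetical sketch of unsupervised pre-training followed by fine-tuning.

All components are toy stand-ins for illustration, not the paper's method.
"""
import numpy as np

rng = np.random.default_rng(0)

class ToyEnv:
    """Stand-in environment emitting random low-dimensional 'observations'."""
    def reset(self):
        return rng.normal(size=(8,))
    def step(self, action):
        obs = rng.normal(size=(8,))
        reward = float(-np.linalg.norm(obs - action))  # arbitrary task reward
        return obs, reward

class WorldModel:
    """Placeholder dynamics model trained by self-supervised next-step prediction."""
    def __init__(self, dim=8):
        self.W = np.zeros((dim, dim))
    def update(self, obs, next_obs, lr=1e-3):
        pred = self.W @ obs
        self.W += lr * np.outer(next_obs - pred, obs)  # one SGD step on prediction error

def explore_action(obs):
    """Exploration stand-in: random actions; real methods reward novelty/surprise."""
    return rng.normal(size=obs.shape)

def plan_action(model, obs, n_candidates=16):
    """Planning stand-in: score candidate actions against the model's imagined outcome."""
    candidates = rng.normal(size=(n_candidates, obs.shape[0]))
    imagined = model.W @ obs  # imagined next observation
    scores = -np.linalg.norm(imagined - candidates, axis=1)
    return candidates[int(np.argmax(scores))]

env, model = ToyEnv(), WorldModel()

# Phase 1: reward-free pre-training on a large self-supervised budget.
obs = env.reset()
for _ in range(2000):
    action = explore_action(obs)
    next_obs, _ = env.step(action)      # the task reward is ignored here
    model.update(obs, next_obs)         # learn dynamics from exploration data
    obs = next_obs

# Phase 2: data-efficient fine-tuning on a much smaller task-specific budget.
obs, total = env.reset(), 0.0
for _ in range(100):                    # ~20x fewer steps than pre-training
    action = plan_action(model, obs)
    next_obs, reward = env.step(action)
    model.update(obs, next_obs)         # keep refining the model on task data
    total += reward
    obs = next_obs
print(f"fine-tuning return: {total:.2f}")
```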

Cite

Text

Rajeswar et al. "Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels." ICML 2022 Workshops: DARL, 2022.

Markdown

[Rajeswar et al. "Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels." ICML 2022 Workshops: DARL, 2022.](https://mlanthology.org/icmlw/2022/rajeswar2022icmlw-unsupervised/)

BibTeX

@inproceedings{rajeswar2022icmlw-unsupervised,
  title     = {{Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels}},
  author    = {Rajeswar, Sai and Mazzaglia, Pietro and Verbelen, Tim and Piché, Alexandre and Dhoedt, Bart and Courville, Aaron and Lacoste, Alexandre},
  booktitle = {ICML 2022 Workshops: DARL},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/rajeswar2022icmlw-unsupervised/}
}