Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels
Abstract
Reinforcement learning (RL) aims at autonomously performing complex tasks. To this end, a reward signal is used to steer the learning process. While successful in many circumstances, the approach is typically data-hungry, requiring large amounts of task-specific interaction between agent and environment to learn efficient behaviors. To alleviate this, unsupervised RL proposes to collect data through self-supervised interaction to accelerate task-specific adaptation. However, whether current unsupervised strategies lead to improved generalization capabilities is still unclear, more so when the input observations are high-dimensional. In this work, we advance the field by closing the performance gap in the Unsupervised RL Benchmark, a collection of tasks to be solved in a data-efficient manner, after interacting with the environment in a self-supervised way. Our approach uses unsupervised exploration to collect experience for pre-training a world model. Then, when fine-tuning for downstream tasks, the agent leverages the learned model and a hybrid planner to adapt efficiently to the given tasks, achieving results comparable to task-specific baselines while using 20x less data. We extensively evaluate our approach, comparing several exploration methods and improving the fine-tuning process by studying the interactions between the learned components. Furthermore, we investigate the limitations of the pre-trained agent, gaining insights into how these affect its decision-making and shedding light on new research directions.
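To make the two-phase recipe in the abstract concrete, here is a minimal, self-contained sketch of the idea: a reward-free exploration phase driven by model prediction error pre-trains a world model, and a fine-tuning phase plans with that model against the task reward. Everything in it is an illustrative assumption, not the paper's implementation: the toy 1-D environment, the linear world model, the surprise-based action selection, and the random-shooting planner (which stands in for the paper's hybrid planner).

```python
# Illustrative sketch (assumed, not the paper's code): pre-train a world
# model via curiosity-style exploration, then plan with it for a task.
import numpy as np

rng = np.random.default_rng(0)

# Toy environment: 1-D point mass; the downstream task rewards reaching x = 5.
def step(x, a):
    return x + 0.1 * a

# World model: a single learned coefficient k in x' = x + k * a.
k_hat = 0.0

def model_update(k_hat, x, a, x_next, lr=0.05):
    pred = x + k_hat * a
    return k_hat + lr * (x_next - pred) * a  # gradient step on squared error

# Phase 1: self-supervised, reward-free exploration. Picking the action the
# model currently predicts worst stands in for an intrinsic exploration
# reward (a simplification: here we peek at the true outcome to score it).
x = 0.0
for _ in range(500):
    candidates = rng.uniform(-1.0, 1.0, size=8)
    errors = [abs(step(x, a) - (x + k_hat * a)) for a in candidates]
    a = candidates[int(np.argmax(errors))]  # most "surprising" action
    x_next = step(x, a)
    k_hat = model_update(k_hat, x, a, x_next)
    x = x_next

# Phase 2: fine-tune on the task by planning with the pre-trained model
# (random-shooting MPC here, standing in for the paper's hybrid planner).
def plan(x, k_hat, horizon=5, n=64):
    seqs = rng.uniform(-1.0, 1.0, size=(n, horizon))
    best, best_ret = seqs[0], -np.inf
    for seq in seqs:
        xi, ret = x, 0.0
        for a in seq:
            xi = xi + k_hat * a   # imagine rollouts with the learned model
            ret += -abs(xi - 5.0)  # task reward: reach x = 5
        if ret > best_ret:
            best, best_ret = seq, ret
    return best[0]

x = 0.0
for t in range(100):
    x = step(x, plan(x, k_hat))
print(f"learned k = {k_hat:.3f}, final state x = {x:.2f}")
```

The key design point the sketch preserves is that no task reward is seen during phase 1; only the dynamics model is learned, and the task reward enters solely through the planner at fine-tuning time.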
Cite
Text
Rajeswar et al. "Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels." ICML 2022 Workshops: DARL, 2022.
Markdown
[Rajeswar et al. "Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels." ICML 2022 Workshops: DARL, 2022.](https://mlanthology.org/icmlw/2022/rajeswar2022icmlw-unsupervised/)
BibTeX
@inproceedings{rajeswar2022icmlw-unsupervised,
  title     = {{Unsupervised Model-Based Pre-Training for Data-Efficient Reinforcement Learning from Pixels}},
  author    = {Rajeswar, Sai and Mazzaglia, Pietro and Verbelen, Tim and Piché, Alexandre and Dhoedt, Bart and Courville, Aaron and Lacoste, Alexandre},
  booktitle = {ICML 2022 Workshops: DARL},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/rajeswar2022icmlw-unsupervised/}
}