MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations

Nicklas Hansen, Yixin Lin, Hao Su, Xiaolong Wang, Vikash Kumar, Aravind Rajeswaran

NeurIPSW 2022

Abstract

Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications, and in particular for visuo-motor control. Model-based RL has the potential to be highly sample efficient by concurrently learning a world model and using synthetic rollouts for planning and policy improvement. However, in practice, sample-efficient learning with model-based RL is bottlenecked by the exploration challenge. In this work, we find that leveraging just a handful of demonstrations can dramatically improve the sample efficiency of model-based RL. Simply appending demonstrations to the interaction dataset, however, does not suffice. We identify key ingredients for leveraging demonstrations in model learning -- policy pretraining, targeted exploration, and oversampling of demonstration data -- which form the three phases of our model-based RL framework. We empirically study three complex visuo-motor control domains and find that our method is 160%-250% more successful in completing sparse reward tasks compared to prior approaches in the low data regime (100K interaction steps, 5 demonstrations). Code and videos are available at: https://nicklashansen.github.io/modemrl.
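
The abstract's third ingredient, oversampling of demonstration data, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_mixed_batch`, the `demo_fraction` parameter and its default, and the tuple layout of transitions are all assumptions made for illustration. The idea shown is only that each training batch reserves a fixed share for demonstration transitions, so a handful of demonstrations is not drowned out as interaction data grows.

```python
import random


def sample_mixed_batch(demos, interactions, batch_size, demo_fraction=0.5, rng=None):
    """Draw a batch in which a fixed share comes from demonstrations.

    `demo_fraction` is a hypothetical knob, not a value taken from the paper.
    """
    rng = rng or random.Random()
    if not demos:
        raise ValueError("need at least one demonstration transition")
    # Before any interaction data exists, fall back to demonstrations only.
    n_demo = batch_size if not interactions else round(batch_size * demo_fraction)
    n_inter = batch_size - n_demo
    # Sample with replacement so the small demonstration set can be oversampled.
    batch = [rng.choice(demos) for _ in range(n_demo)]
    batch += [rng.choice(interactions) for _ in range(n_inter)]
    rng.shuffle(batch)
    return batch


# Usage: 5 demonstration transitions against 1000 interaction transitions.
demos = [("demo", i) for i in range(5)]
interactions = [("env", i) for i in range(1000)]
batch = sample_mixed_batch(demos, interactions, batch_size=8, rng=random.Random(0))
```

With uniform sampling over the combined data, demonstrations would make up well under 1% of each batch in this toy setup; fixing the share keeps them at half of every batch regardless of how much interaction data accumulates.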

Cite

Text

Hansen et al. "MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations." NeurIPS 2022 Workshops: DeepRL, 2022.

Markdown

[Hansen et al. "MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/hansen2022neuripsw-modem/)

BibTeX

@inproceedings{hansen2022neuripsw-modem,
  title     = {{MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations}},
  author    = {Hansen, Nicklas and Lin, Yixin and Su, Hao and Wang, Xiaolong and Kumar, Vikash and Rajeswaran, Aravind},
  booktitle = {NeurIPS 2022 Workshops: DeepRL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/hansen2022neuripsw-modem/}
}