Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning
Abstract
We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control. DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements that yield state-of-the-art results on the DeepMind Control Suite. Notably, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations, a result previously unattained by model-free RL. DrQ-v2 is conceptually simple, easy to implement, and has a significantly smaller computational footprint than prior work, with the majority of tasks taking just 8 hours to train on a single GPU. Finally, we publicly release DrQ-v2's implementation to provide RL practitioners with a strong and computationally efficient baseline.
Cite
Text
Yarats et al. "Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning." NeurIPS 2021 Workshops: DeepRL, 2021.
Markdown
[Yarats et al. "Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/yarats2021neuripsw-mastering/)
BibTeX
@inproceedings{yarats2021neuripsw-mastering,
  title = {{Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning}},
  author = {Yarats, Denis and Fergus, Rob and Lazaric, Alessandro and Pinto, Lerrel},
  booktitle = {NeurIPS 2021 Workshops: DeepRL},
  year = {2021},
  url = {https://mlanthology.org/neuripsw/2021/yarats2021neuripsw-mastering/}
}