On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Abstract
In this paper, we examine the effectiveness of pre-training for visuo-motor control tasks. We revisit a simple Learning-from-Scratch (LfS) baseline that incorporates data augmentation and a shallow ConvNet, and find that this baseline is surprisingly competitive with recent approaches (PVR, MVP, R3M) that leverage frozen visual representations trained on large-scale vision datasets – across a variety of algorithms, task domains, and metrics in simulation and on a real robot. Our results demonstrate that these methods are hindered by a significant domain gap between the pre-training datasets and current benchmarks for visuo-motor control, which is alleviated by finetuning. Based on our findings, we provide recommendations for future research in pre-training for control and hope that our simple yet strong baseline will aid in accurately benchmarking progress in this area. Code: https://github.com/gemcollector/learning-from-scratch.
Cite
Text
Hansen et al. "On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline." International Conference on Machine Learning, 2023.
Markdown
[Hansen et al. "On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/hansen2023icml-pretraining/)
BibTeX
@inproceedings{hansen2023icml-pretraining,
title = {{On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline}},
author = {Hansen, Nicklas and Yuan, Zhecheng and Ze, Yanjie and Mu, Tongzhou and Rajeswaran, Aravind and Su, Hao and Xu, Huazhe and Wang, Xiaolong},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {12511--12526},
volume = {202},
url = {https://mlanthology.org/icml/2023/hansen2023icml-pretraining/}
}