RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches
Abstract
Natural language and images are commonly used as goal representations in goal-conditioned imitation learning. However, language can be ambiguous and images can be over-specified. In this work, we study hand-drawn sketches as a modality for goal specification. Sketches can be easy to provide on the fly like language, but like images they can also help a downstream policy to be spatially aware. By virtue of being minimal, sketches can further help disambiguate task-relevant from irrelevant objects. We present RT-Sketch, a goal-conditioned policy for manipulation that takes a hand-drawn sketch of the desired scene as input and outputs actions. We train RT-Sketch on a dataset of trajectories paired with synthetically generated goal sketches. We evaluate this approach on six manipulation skills involving tabletop object rearrangements on an articulated countertop. Experimentally, we find that RT-Sketch performs comparably to image- or language-conditioned agents in straightforward settings, while achieving greater robustness when language goals are ambiguous or visual distractors are present. Additionally, we show that RT-Sketch handles sketches with varied levels of specificity, ranging from minimal line drawings to detailed, colored drawings. For supplementary material and videos, please visit http://rt-sketch.github.io.
Cite
Text
Sundaresan et al. "RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches." Proceedings of The 8th Conference on Robot Learning, 2024.
Markdown
[Sundaresan et al. "RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/sundaresan2024corl-rtsketch/)
BibTeX
@inproceedings{sundaresan2024corl-rtsketch,
title = {{RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches}},
author = {Sundaresan, Priya and Vuong, Quan and Gu, Jiayuan and Xu, Peng and Xiao, Ted and Kirmani, Sean and Yu, Tianhe and Stark, Michael and Jain, Ajinkya and Hausman, Karol and Sadigh, Dorsa and Bohg, Jeannette and Schaal, Stefan},
booktitle = {Proceedings of The 8th Conference on Robot Learning},
year = {2024},
pages = {70--96},
volume = {270},
url = {https://mlanthology.org/corl/2024/sundaresan2024corl-rtsketch/}
}