Video-Guided Skill Discovery

Abstract

We study how embodied agents can use passive data, such as videos, to guide the discovery of useful and diverse skills. Existing datasets have the potential to be an abundant and rich source of examples for robot learning, revealing not only which tasks to perform, but also how to achieve them. Without structural priors, existing approaches to skill discovery are often underspecified and ineffective in real-world, high-DoF settings. Our approach uses the temporal information in videos to learn structured representations of the world, which are then used to create shaped rewards for efficiently learning from open-ended play and fine-tuning to target tasks. We demonstrate the ability to effectively learn skills by leveraging action-free video data in a kitchen manipulation setting and on synthetic control tasks.
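To make the abstract's core idea concrete, here is a minimal, purely illustrative sketch (not the paper's implementation): assuming some encoder `phi` has already been learned from video frames, a shaped reward can score a state by its embedding distance to a goal frame drawn from a video. The linear encoder and all names below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(frame, W):
    """Toy linear frame encoder; a stand-in for a representation
    pretrained on action-free video (hypothetical, for illustration)."""
    return W @ frame

def shaped_reward(state_frame, goal_frame, W):
    """Negative embedding distance: reward grows as the current
    state's embedding approaches the goal frame's embedding."""
    return -np.linalg.norm(phi(state_frame, W) - phi(goal_frame, W))

W = rng.normal(size=(8, 32))           # stand-in for video-pretrained weights
goal = rng.normal(size=32)             # goal frame taken from a video
near = goal + 0.01 * rng.normal(size=32)  # state close to the goal
far = rng.normal(size=32)              # unrelated state

# A state near the goal should receive a higher shaped reward.
assert shaped_reward(near, goal, W) > shaped_reward(far, goal, W)
# The goal frame itself receives the maximal reward of zero.
assert np.isclose(shaped_reward(goal, goal, W), 0.0)
```

Such a dense reward could then be handed to any standard RL algorithm in place of a sparse task reward; the paper's actual method for learning the representation from temporal structure is more involved than this sketch.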

Cite

Text

Tomar et al. "Video-Guided Skill Discovery." ICML 2023 Workshops: MFPL, 2023.

Markdown

[Tomar et al. "Video-Guided Skill Discovery." ICML 2023 Workshops: MFPL, 2023.](https://mlanthology.org/icmlw/2023/tomar2023icmlw-videoguided/)

BibTeX

@inproceedings{tomar2023icmlw-videoguided,
  title     = {{Video-Guided Skill Discovery}},
  author    = {Tomar, Manan and Ghosh, Dibya and Myers, Vivek and Dragan, Anca and Taylor, Matthew E. and Bachman, Philip and Levine, Sergey},
  booktitle = {ICML 2023 Workshops: MFPL},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/tomar2023icmlw-videoguided/}
}