Teaching Robots with Show and Tell: Using Foundation Models to Synthesize Robot Policies from Language and Visual Demonstration

Michael Murray, Abhishek Gupta, Maya Cakmak

CoRL 2024 pp. 4033-4050

Abstract

We introduce a modular, neuro-symbolic framework for teaching robots new skills through language and visual demonstration. Our approach, ShowTell, composes a mixture of foundation models to synthesize robot manipulation programs that are easy to interpret and that generalize across a wide range of tasks and environments. ShowTell is designed to handle complex demonstrations involving high-level logic, such as loops and conditionals, while remaining intuitive and natural for end users. We validate this approach through a series of real-world robot experiments, showing that ShowTell outperforms a state-of-the-art baseline based on GPT-4V on a variety of tasks and that it generalizes to unseen environments and to within-category objects.

Cite

Text

Murray et al. "Teaching Robots with Show and Tell: Using Foundation Models to Synthesize Robot Policies from Language and Visual Demonstration." Proceedings of The 8th Conference on Robot Learning, 2024.

Markdown

[Murray et al. "Teaching Robots with Show and Tell: Using Foundation Models to Synthesize Robot Policies from Language and Visual Demonstration." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/murray2024corl-teaching/)

BibTeX

@inproceedings{murray2024corl-teaching,
  title     = {{Teaching Robots with Show and Tell: Using Foundation Models to Synthesize Robot Policies from Language and Visual Demonstration}},
  author    = {Murray, Michael and Gupta, Abhishek and Cakmak, Maya},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  year      = {2024},
  pages     = {4033--4050},
  volume    = {270},
  url       = {https://mlanthology.org/corl/2024/murray2024corl-teaching/}
}