Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation

Abstract

Learned language-conditioned robot policies often struggle to adapt effectively to new real-world tasks, even when pre-trained across a diverse set of instructions. We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition provided by vision-language models (VLMs). Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with candidate language decompositions sampled from a VLM to enable rapid nonparametric adaptation, avoiding the need for a larger fine-tuning dataset. We evaluate PALO in extensive real-world experiments on challenging unseen, long-horizon robot manipulation tasks. We find that PALO is able to consistently complete long-horizon, multi-tier tasks in the real world, outperforming state-of-the-art pre-trained generalist policies as well as methods that have access to the same demonstrations.
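As a rough illustration of the adaptation step the abstract describes (sampling candidate language decompositions from a VLM and selecting the one that best explains the few-shot demonstrations under the pre-trained policy), consider the Python sketch below. The interfaces policy.log_prob and vlm.propose_decompositions, and the restriction to two subtasks with a grid-searched switch time, are hypothetical simplifications for exposition, not the paper's implementation.

def bc_loss(policy, obs, action, instruction):
    # Negative log-likelihood of the demonstrated action under the
    # language-conditioned policy (hypothetical interface).
    return -policy.log_prob(obs, action, instruction)

def score_decomposition(policy, demos, subtasks):
    # Score one candidate decomposition (restricted here to two subtasks):
    # for each demo, grid-search the time step at which the conditioning
    # instruction switches from the first subtask to the second, keeping
    # the lowest total behavioral-cloning loss.
    total = 0.0
    for demo in demos:  # demo is a list of (observation, action) pairs
        best = float("inf")
        for t_switch in range(1, len(demo)):
            loss = sum(
                bc_loss(policy, obs, act,
                        subtasks[0] if t < t_switch else subtasks[1])
                for t, (obs, act) in enumerate(demo)
            )
            best = min(best, loss)
        total += best
    return total

def palo_adapt(policy, demos, task_instruction, vlm, num_candidates=16):
    # Sample candidate decompositions of the task instruction from the VLM
    # and return the one that best explains the few-shot demonstrations.
    candidates = vlm.propose_decompositions(task_instruction, num_candidates)
    return min(candidates, key=lambda c: score_decomposition(policy, demos, c))

Because the selection is a discrete search over VLM proposals rather than gradient fine-tuning, adaptation in this sketch needs only the handful of demonstrations and forward passes through the frozen pre-trained policy.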

Cite

Text

Myers et al. "Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation." Proceedings of The 8th Conference on Robot Learning, 2024.

Markdown

[Myers et al. "Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/myers2024corl-policy/)

BibTeX

@inproceedings{myers2024corl-policy,
  title     = {{Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation}},
  author    = {Myers, Vivek and Zheng, Chunyuan and Mees, Oier and Fang, Kuan and Levine, Sergey},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  year      = {2024},
  pages     = {1402--1426},
  volume    = {270},
  url       = {https://mlanthology.org/corl/2024/myers2024corl-policy/}
}