LILA: Language-Informed Latent Actions

Abstract

We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration. LILA falls under the shared autonomy paradigm: in addition to providing discrete language inputs, humans are given a low-dimensional controller – e.g., a 2 degree-of-freedom (DoF) joystick that can move left/right and up/down – for operating the robot. LILA learns to use language to modulate this controller, providing users with a language-informed control space: given an instruction like "place the cereal bowl on the tray," LILA may learn a 2-DoF space where one dimension controls the distance from the robot’s end-effector to the bowl, and the other dimension controls the robot’s end-effector pose relative to the grasp point on the bowl. We evaluate LILA with real-world user studies, where users can provide a language instruction while operating a 7-DoF Franka Emika Panda Arm to complete a series of complex manipulation tasks. We show that LILA models are not only more sample efficient and performant than imitation learning and end-effector control baselines, but that they are also qualitatively preferred by users.
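The idea of a language-informed control space can be illustrated with a small sketch: a decoder takes the robot state, an embedding of the instruction, and the 2-DoF joystick input, and returns a 7-DoF action. Everything below is an assumption made for illustration and is not the authors' architecture: the names `embed_instruction` and `LanguageConditionedDecoder` are hypothetical, the language encoder is a toy hashed bag-of-words, the weights are random rather than learned from demonstrations, and the second instruction is invented for contrast.

```python
import math
import random

ROBOT_DOF = 7    # joints on the arm described in the abstract
LATENT_DOF = 2   # joystick axes (left/right, up/down)


def embed_instruction(text, dim=8):
    """Toy stand-in for a language encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # Deterministic character-sum hash so results repeat across runs.
        vec[sum(ord(c) for c in word) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class LanguageConditionedDecoder:
    """Maps (state, instruction embedding, joystick input) to a 7-DoF action."""

    def __init__(self, embed_dim=8, seed=0):
        rng = random.Random(seed)
        in_dim = ROBOT_DOF + embed_dim
        # One linear map per entry of the 7x2 basis. Untrained random weights:
        # a real model would fit these to demonstrations.
        self.weights = [
            [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
             for _ in range(LATENT_DOF)]
            for _ in range(ROBOT_DOF)
        ]

    def basis(self, state, lang):
        """Build a 7x2 control basis that depends on state and instruction."""
        context = list(state) + list(lang)
        return [
            [sum(w * c for w, c in zip(self.weights[i][j], context))
             for j in range(LATENT_DOF)]
            for i in range(ROBOT_DOF)
        ]

    def decode(self, state, lang, joystick):
        """Project the 2-DoF joystick input through the basis to 7 joints."""
        basis = self.basis(state, lang)
        return [
            sum(basis[i][j] * joystick[j] for j in range(LATENT_DOF))
            for i in range(ROBOT_DOF)
        ]


decoder = LanguageConditionedDecoder()
state = [0.0] * ROBOT_DOF
push_right = [1.0, 0.0]

# The same joystick motion decodes to different joint actions per instruction.
a1 = decoder.decode(
    state, embed_instruction("place the cereal bowl on the tray"), push_right
)
a2 = decoder.decode(state, embed_instruction("pour the cup"), push_right)
```

In this sketch the instruction changes the basis, so an identical joystick push produces a different 7-DoF action under each instruction, while a centered joystick always yields zero motion. This is one simple way to read "language modulates the controller"; the paper itself should be consulted for the actual model.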

Cite

Text

Karamcheti et al. "LILA: Language-Informed Latent Actions." Conference on Robot Learning, 2021.

Markdown

[Karamcheti et al. "LILA: Language-Informed Latent Actions." Conference on Robot Learning, 2021.](https://mlanthology.org/corl/2021/karamcheti2021corl-lila/)

BibTeX

@inproceedings{karamcheti2021corl-lila,
  title     = {{LILA: Language-Informed Latent Actions}},
  author    = {Karamcheti, Siddharth and Srivastava, Megha and Liang, Percy and Sadigh, Dorsa},
  booktitle = {Conference on Robot Learning},
  year      = {2021},
  pages     = {1379--1390},
  volume    = {164},
  url       = {https://mlanthology.org/corl/2021/karamcheti2021corl-lila/}
}