Learning Visually Guided Latent Actions for Assistive Teleoperation
Abstract
It is challenging for humans — particularly people living with physical disabilities — to control high-dimensional and dexterous robots. Prior work explores how robots can learn embedding functions that map a human’s low-dimensional inputs (e.g., via a joystick) to complex, high-dimensional robot actions for assistive teleoperation; unfortunately, there are many more high-dimensional actions than available low-dimensional inputs! To extract the correct action and maximally assist their human controller, robots must reason over their current context: for example, pressing a joystick right when interacting with a coffee cup indicates a different action than when interacting with food. In this work, we develop assistive robots that condition their latent embeddings on visual inputs. We explore a spectrum of plausible visual encoders and show that incorporating object detectors pretrained on a small amount of cheap and easy-to-collect structured data enables i) accurately and robustly recognizing the current context and ii) generalizing control embeddings to new objects and tasks. In user studies with a high-dimensional physical robot arm, participants leverage this approach to perform new tasks with unseen objects. Our results indicate that structured visual representations improve few-shot performance and are subjectively preferred by users.
Cite
Text
Karamcheti et al. "Learning Visually Guided Latent Actions for Assistive Teleoperation." Proceedings of the 3rd Conference on Learning for Dynamics and Control, 2021.
Markdown
[Karamcheti et al. "Learning Visually Guided Latent Actions for Assistive Teleoperation." Proceedings of the 3rd Conference on Learning for Dynamics and Control, 2021.](https://mlanthology.org/l4dc/2021/karamcheti2021l4dc-learning/)
BibTeX
@inproceedings{karamcheti2021l4dc-learning,
title = {{Learning Visually Guided Latent Actions for Assistive Teleoperation}},
author = {Karamcheti, Siddharth and Zhai, Albert J. and Losey, Dylan P. and Sadigh, Dorsa},
booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
year = {2021},
pages = {1230--1241},
volume = {144},
url = {https://mlanthology.org/l4dc/2021/karamcheti2021l4dc-learning/}
}