Open-TeleVision: Teleoperation with Immersive Active Visual Feedback

Abstract

Teleoperation serves as a powerful method for collecting on-robot data essential for robot learning from demonstrations. The intuitiveness and ease of use of the teleoperation system are crucial for ensuring high-quality, diverse, and scalable data. To achieve this, we propose an immersive teleoperation system, $\textbf{Open-TeleVision}$, that allows operators to actively perceive the robot’s surroundings in a stereoscopic manner. Additionally, the system mirrors the operator’s arm and hand movements on the robot, creating an immersive experience as if the operator’s mind were transmitted to a robot embodiment. We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks (can sorting, can insertion, folding, and unloading) for two different humanoid robots, and by deploying the policies in the real world. The entire system will be open-sourced.

Cite

Text

Cheng et al. "Open-TeleVision: Teleoperation with Immersive Active Visual Feedback." Proceedings of The 8th Conference on Robot Learning, 2024.

Markdown

[Cheng et al. "Open-TeleVision: Teleoperation with Immersive Active Visual Feedback." Proceedings of The 8th Conference on Robot Learning, 2024.](https://mlanthology.org/corl/2024/cheng2024corl-opentelevision/)

BibTeX

@inproceedings{cheng2024corl-opentelevision,
  title     = {{Open-TeleVision: Teleoperation with Immersive Active Visual Feedback}},
  author    = {Cheng, Xuxin and Li, Jialong and Yang, Shiqi and Yang, Ge and Wang, Xiaolong},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  year      = {2024},
  pages     = {2729--2749},
  volume    = {270},
  url       = {https://mlanthology.org/corl/2024/cheng2024corl-opentelevision/}
}