DexMV: Imitation Learning for Dexterous Manipulation from Human Videos

Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang

ECCV 2022

doi:10.1007/978-3-031-19842-7_33

Abstract

While in computer vision we have made significant progress on understanding hand-object interactions, it is still very challenging for robots to perform complex dexterous manipulation. In this paper, we propose a new platform and pipeline, DexMV (Dexterous Manipulation from Videos), for imitation learning to bridge the gap between computer vision and robot learning. We design a platform with: (i) a simulation system for complex dexterous manipulation tasks with a multi-finger robot hand and (ii) a computer vision system to record large-scale demonstrations of a human hand conducting the same tasks. In the DexMV pipeline, we couple 3D hand and object pose estimation on the videos with a hand motion retargeting algorithm to extract the hand-object state trajectories. We compare multiple imitation learning and reinforcement learning (RL) algorithms on the manipulation tasks in the simulation. We show that the demonstrations can indeed improve robot learning by a large margin and solve the complex tasks which RL alone cannot solve.

Cite

Text

Qin et al. "DexMV: Imitation Learning for Dexterous Manipulation from Human Videos." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19842-7_33

Markdown

[Qin et al. "DexMV: Imitation Learning for Dexterous Manipulation from Human Videos." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/qin2022eccv-dexmv/) doi:10.1007/978-3-031-19842-7_33

BibTeX

@inproceedings{qin2022eccv-dexmv,
  title     = {{DexMV: Imitation Learning for Dexterous Manipulation from Human Videos}},
  author    = {Qin, Yuzhe and Wu, Yueh-Hua and Liu, Shaowei and Jiang, Hanwen and Yang, Ruihan and Fu, Yang and Wang, Xiaolong},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19842-7_33},
  url       = {https://mlanthology.org/eccv/2022/qin2022eccv-dexmv/}
}