Command-Driven Articulated Object Understanding and Manipulation

Abstract

We present Cart, a new approach to manipulating articulated objects via human commands. Whereas existing work focuses on inferring articulation structures, we further support manipulating articulated shapes to align them with simple command templates. The key idea of Cart is to use the predicted object structure to connect visual observations with user commands for effective manipulation. This is achieved by encoding command messages for motion prediction and by a test-time adaptation that adjusts the amount of movement using only command supervision. Across a rich variety of object categories, Cart accurately manipulates object shapes and outperforms state-of-the-art approaches in understanding the inherent articulation structures. It also generalizes well to unseen object categories and real-world objects. We hope Cart will open new directions for instructing machines to operate articulated objects.

Cite

Text

Chu et al. "Command-Driven Articulated Object Understanding and Manipulation." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00851

Markdown

[Chu et al. "Command-Driven Articulated Object Understanding and Manipulation." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/chu2023cvpr-commanddriven/) doi:10.1109/CVPR52729.2023.00851

BibTeX

@inproceedings{chu2023cvpr-commanddriven,
  title     = {{Command-Driven Articulated Object Understanding and Manipulation}},
  author    = {Chu, Ruihang and Liu, Zhengzhe and Ye, Xiaoqing and Tan, Xiao and Qi, Xiaojuan and Fu, Chi-Wing and Jia, Jiaya},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {8813-8823},
  doi       = {10.1109/CVPR52729.2023.00851},
  url       = {https://mlanthology.org/cvpr/2023/chu2023cvpr-commanddriven/}
}