VinT-6D: A Large-Scale Object-in-Hand Dataset from Vision, Touch and Proprioception

Abstract

This paper addresses the scarcity of large-scale datasets for accurate object-in-hand pose estimation, which is crucial for robotic in-hand manipulation within the "Perception-Planning-Control" paradigm. Specifically, we introduce VinT-6D, the first extensive multi-modal dataset integrating vision, touch, and proprioception to enhance robotic manipulation. VinT-6D comprises 2 million VinT-Sim and 0.1 million VinT-Real entries, collected via simulations in MuJoCo and Blender and on a custom-designed real-world platform. The dataset is tailored for robotic hands, providing whole-hand tactile perception together with high-quality, well-aligned data. To the best of our knowledge, VinT-Real is the largest such real-world collection given the difficulty of data acquisition in real environments, and it thereby narrows the simulation-to-real gap relative to previous work. Built upon VinT-6D, we present a benchmark method that achieves significant performance improvements by fusing multi-modal information. The project is available at https://VinT-6D.github.io/.
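To make the multi-modal structure of a dataset entry concrete, the sketch below shows one way a single sample combining vision, touch, proprioception, and a 6D object-in-hand pose label might be represented. This is a hypothetical illustration only: the class name `VinT6DSample`, the field names, and all array shapes are assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class VinT6DSample:
    """Hypothetical object-in-hand sample combining the three modalities (shapes are illustrative)."""
    rgb: np.ndarray           # camera image, e.g. (H, W, 3) uint8
    depth: np.ndarray         # aligned depth map, (H, W) float32, meters
    tactile: np.ndarray       # whole-hand tactile readings, (num_taxels,) float32
    joint_angles: np.ndarray  # proprioception: hand joint positions, (num_joints,) float32
    obj_pose: np.ndarray      # 6D object-in-hand pose label as a 4x4 homogeneous transform

def pose_to_translation_rotation(pose: np.ndarray):
    """Split a 4x4 pose matrix into a translation vector and a 3x3 rotation matrix."""
    return pose[:3, 3], pose[:3, :3]

# Minimal usage with placeholder data
sample = VinT6DSample(
    rgb=np.zeros((480, 640, 3), dtype=np.uint8),
    depth=np.zeros((480, 640), dtype=np.float32),
    tactile=np.zeros(96, dtype=np.float32),
    joint_angles=np.zeros(16, dtype=np.float32),
    obj_pose=np.eye(4, dtype=np.float32),
)
t, R = pose_to_translation_rotation(sample.obj_pose)
print(t.shape, R.shape)  # (3,) (3, 3)
```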

Cite

Text

Wan et al. "VinT-6D: A Large-Scale Object-in-Hand Dataset from Vision, Touch and Proprioception." International Conference on Machine Learning, 2024.

Markdown

[Wan et al. "VinT-6D: A Large-Scale Object-in-Hand Dataset from Vision, Touch and Proprioception." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/wan2024icml-vint6d/)

BibTeX

@inproceedings{wan2024icml-vint6d,
  title     = {{VinT-6D: A Large-Scale Object-in-Hand Dataset from Vision, Touch and Proprioception}},
  author    = {Wan, Zhaoliang and Ling, Yonggen and Yi, Senlin and Qi, Lu and Lee, Wang Wei and Lu, Minglei and Yang, Sicheng and Teng, Xiao and Lu, Peng and Yang, Xu and Yang, Ming-Hsuan and Cheng, Hui},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {49921--49940},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/wan2024icml-vint6d/}
}