RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Abstract

We introduce a new RGB-D object dataset captured in the wild, called WildRGB-D. Unlike most existing real-world object-centric datasets, which only come with RGB capture, the direct capture of the depth channel allows better 3D annotations and broader downstream applications. WildRGB-D comprises large-scale category-level RGB-D object videos, which are taken using an iPhone to go around the objects in 360 degrees. It contains around 8500 recorded objects and nearly 20000 RGB-D videos across 46 common object categories. These videos are taken with diverse cluttered backgrounds with three setups to cover as many real-world scenarios as possible: (i) a single object in one video; (ii) multiple objects in one video; and (iii) an object with a static hand in one video. The dataset is annotated with object masks, real-world scale camera poses, and reconstructed aggregated point clouds from RGB-D videos. We benchmark four tasks with WildRGB-D, including novel view synthesis, camera pose estimation, object 6D pose estimation, and object surface reconstruction. Our experiments show that the large-scale capture of RGB-D objects provides a large potential to advance 3D object learning. Our project page is https://wildrgbd.github.io/.

Cite

Text

Xia et al. "RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02112

Markdown

[Xia et al. "RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/xia2024cvpr-rgbd/) doi:10.1109/CVPR52733.2024.02112

BibTeX

@inproceedings{xia2024cvpr-rgbd,
  title     = {{RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos}},
  author    = {Xia, Hongchi and Fu, Yang and Liu, Sifei and Wang, Xiaolong},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {22378-22389},
  doi       = {10.1109/CVPR52733.2024.02112},
  url       = {https://mlanthology.org/cvpr/2024/xia2024cvpr-rgbd/}
}