PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects

Abstract

Object pose estimation is crucial for robotic applications and augmented reality. Beyond instance-level 6D object pose estimation methods, estimating category-level pose and shape has become a promising trend. Such a new research field needs to be supported by well-designed datasets. To provide the community with a benchmark that has high-quality ground truth annotations, we introduce PhoCaL, a multi-modal dataset for category-level object pose estimation with photometrically challenging objects. PhoCaL comprises 60 high-quality 3D models of household objects across 8 categories, including highly reflective, transparent and symmetric objects. We developed a novel robot-supported multi-modal (RGB, depth, polarisation) data acquisition and annotation process. It ensures sub-millimeter pose accuracy for opaque textured, shiny and transparent objects, with no motion blur and perfect camera synchronisation. To set a benchmark for our dataset, state-of-the-art RGB-D and monocular RGB methods are evaluated on the challenging scenes of PhoCaL.

Cite

Text

Wang et al. "PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.02054

Markdown

[Wang et al. "PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/wang2022cvpr-phocal/) doi:10.1109/CVPR52688.2022.02054

BibTeX

@inproceedings{wang2022cvpr-phocal,
  title     = {{PhoCaL: A Multi-Modal Dataset for Category-Level Object Pose Estimation with Photometrically Challenging Objects}},
  author    = {Wang, Pengyuan and Jung, HyunJun and Li, Yitong and Shen, Siyuan and Srikanth, Rahul Parthasarathy and Garattoni, Lorenzo and Meier, Sven and Navab, Nassir and Busam, Benjamin},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {21222--21231},
  doi       = {10.1109/CVPR52688.2022.02054},
  url       = {https://mlanthology.org/cvpr/2022/wang2022cvpr-phocal/}
}