Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images

Kluger, Florian; Ackermann, Hanno; Brachmann, Eric; Yang, Michael Ying; Rosenhahn, Bodo

doi:10.1109/CVPR46437.2021.01287

CVPR 2021 pp. 13070-13079

Abstract

Humans perceive and construct the surrounding world as an arrangement of simple parametric models. In particular, man-made environments commonly consist of volumetric primitives such as cuboids or cylinders. Inferring these primitives is an important step to attain high-level, abstract scene descriptions. Previous approaches directly estimate shape parameters from a 2D or 3D input, and are only able to reproduce simple objects, yet unable to accurately parse more complex 3D scenes. In contrast, we propose a robust estimator for primitive fitting, which can meaningfully abstract real-world environments using cuboids. A RANSAC estimator guided by a neural network fits these primitives to 3D features, such as a depth map. We condition the network on previously detected parts of the scene, thus parsing it one-by-one. To obtain 3D features from a single RGB image, we additionally optimise a feature extraction CNN in an end-to-end manner. However, naively minimising point-to-primitive distances leads to large or spurious cuboids occluding parts of the scene behind. We thus propose an occlusion-aware distance metric correctly handling opaque scenes. The proposed algorithm does not require labour-intensive labels, such as cuboid annotations, for training. Results on the challenging NYU Depth v2 dataset demonstrate that the proposed algorithm successfully abstracts cluttered real-world 3D scene layouts.
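To make the "point-to-primitive distance" mentioned in the abstract concrete, the following is a minimal sketch of the naive, unsigned distance from 3D points to the surface of a cuboid. It is an illustration under stated assumptions, not the authors' implementation: the cuboid is assumed to be axis-aligned (no rotation), the function name `point_to_cuboid_distance` and its parameters are hypothetical, and the occlusion-aware metric proposed in the paper is not reproduced here.

```python
import numpy as np


def point_to_cuboid_distance(points, center, half_extents):
    """Unsigned distance from 3D points to the surface of an axis-aligned cuboid.

    Hypothetical helper for illustration only. Assumes an axis-aligned cuboid
    given by its center and half side lengths; this is the naive distance,
    not the occlusion-aware metric described in the abstract.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    half_extents = np.asarray(half_extents, dtype=float)

    # Per-axis offset relative to the faces: positive outside, negative inside.
    q = np.abs(points - center) - half_extents

    # Euclidean distance to the cuboid for points outside it (zero if inside).
    outside = np.linalg.norm(np.maximum(q, 0.0), axis=-1)

    # Negative depth below the nearest face for points inside (zero if outside).
    inside = np.minimum(np.max(q, axis=-1), 0.0)

    # Combine both cases into one unsigned surface distance.
    return np.abs(outside + inside)


if __name__ == "__main__":
    # Cube with side length 2 centered at the origin.
    pts = np.array([[2.0, 0.0, 0.0],   # outside, in front of a face
                    [0.0, 0.0, 0.0],   # at the center
                    [1.0, 0.5, 0.0]])  # exactly on a face
    print(point_to_cuboid_distance(pts, center=[0, 0, 0], half_extents=[1, 1, 1]))
```

Summing or thresholding such distances over all scene points would give the kind of naive fitting objective that, according to the abstract, favours large or spurious cuboids, which is what motivates the occlusion-aware variant.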

Cite

Text

Kluger et al. "Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01287

Markdown

[Kluger et al. "Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/kluger2021cvpr-cuboids/) doi:10.1109/CVPR46437.2021.01287

BibTeX

@inproceedings{kluger2021cvpr-cuboids,
  title     = {{Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images}},
  author    = {Kluger, Florian and Ackermann, Hanno and Brachmann, Eric and Yang, Michael Ying and Rosenhahn, Bodo},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {13070--13079},
  doi       = {10.1109/CVPR46437.2021.01287},
  url       = {https://mlanthology.org/cvpr/2021/kluger2021cvpr-cuboids/}
}