3D-Based Reasoning with Blocks, Support, and Stability

Abstract

3D volumetric reasoning is important for truly understanding a scene. Humans can both segment each object in an image and perceive a rich 3D interpretation of the scene, e.g., the space an object occupies, which objects support other objects, and which objects would, if moved, cause other objects to fall. We propose a new approach for parsing RGB-D images using 3D block units for volumetric reasoning. The algorithm fits image segments with 3D blocks and iteratively evaluates the scene based on block interaction properties. We produce a 3D representation of the scene by jointly optimizing over segmentations, block fitting, supporting relations, and object stability. Our algorithm incorporates the intuition that a good 3D representation of the scene is one that fits the data well and is a stable, self-supporting arrangement of objects (i.e., one that does not topple). We experiment on several datasets, including controlled and real indoor scenarios. Results show that our stability-reasoning framework improves RGB-D segmentation and scene volumetric representation.
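The support and stability intuition in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes axis-aligned, uniform-density boxes with z pointing up and a ground plane at z = 0, and it treats a block as stable only if its center of mass projects into the contact region with a single supporter. The names `Block`, `contact_rect`, and `is_stable` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Block:
    """Axis-aligned box (an assumed simplification of a fitted 3D block)."""
    name: str
    lo: Tuple[float, float, float]  # (x, y, z) minimum corner
    hi: Tuple[float, float, float]  # (x, y, z) maximum corner

    def center_xy(self) -> Tuple[float, float]:
        # For a uniform-density box, the center of mass is the geometric center.
        return ((self.lo[0] + self.hi[0]) / 2, (self.lo[1] + self.hi[1]) / 2)


def contact_rect(top: Block, bottom: Block,
                 tol: float = 1e-6) -> Optional[Tuple[float, float, float, float]]:
    """Return the xy contact rectangle if `top` rests on `bottom`, else None."""
    if abs(top.lo[2] - bottom.hi[2]) > tol:
        return None  # the faces do not touch vertically
    x0, x1 = max(top.lo[0], bottom.lo[0]), min(top.hi[0], bottom.hi[0])
    y0, y1 = max(top.lo[1], bottom.lo[1]), min(top.hi[1], bottom.hi[1])
    if x0 >= x1 or y0 >= y1:
        return None  # the footprints do not overlap
    return (x0, y0, x1, y1)


def is_stable(block: Block, scene: Sequence[Block], tol: float = 1e-6) -> bool:
    """Assumed rule: stable if on the ground, or if the center of mass
    projects inside the contact rectangle with one supporting block.
    Multi-block support and the supporter's own stability are ignored."""
    if abs(block.lo[2]) <= tol:
        return True  # resting on the ground plane
    cx, cy = block.center_xy()
    for other in scene:
        if other is block:
            continue
        rect = contact_rect(block, other, tol)
        if rect and rect[0] <= cx <= rect[2] and rect[1] <= cy <= rect[3]:
            return True
    return False


# Hypothetical scene: a table, a book fully on it, and a box hanging off its edge.
table = Block("table", (0, 0, 0), (4, 4, 1))
book = Block("book", (1, 1, 1), (3, 3, 1.5))
overhang = Block("overhang", (3.5, 0, 1), (6.5, 1, 2))
scene = [table, book, overhang]

for b in scene:
    print(b.name, "stable" if is_stable(b, scene) else "unstable")
```

In this toy scene the overhanging box touches the table, but its center lies beyond the table edge, so the check rejects it. A hypothesis of that kind (a segment fitted with a block that would topple) is what a stability term would penalize in favor of a different segmentation or block fit.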

Cite

Text

Jia et al. "3D-Based Reasoning with Blocks, Support, and Stability." Conference on Computer Vision and Pattern Recognition, 2013. doi:10.1109/CVPR.2013.8

Markdown

[Jia et al. "3D-Based Reasoning with Blocks, Support, and Stability." Conference on Computer Vision and Pattern Recognition, 2013.](https://mlanthology.org/cvpr/2013/jia2013cvpr-3dbased/) doi:10.1109/CVPR.2013.8

BibTeX

@inproceedings{jia2013cvpr-3dbased,
  title     = {{3D-Based Reasoning with Blocks, Support, and Stability}},
  author    = {Jia, Zhaoyin and Gallagher, Andrew and Saxena, Ashutosh and Chen, Tsuhan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2013},
  doi       = {10.1109/CVPR.2013.8},
  url       = {https://mlanthology.org/cvpr/2013/jia2013cvpr-3dbased/}
}