Efficient Structured Prediction for 3D Indoor Scene Understanding
Abstract
Existing approaches to indoor scene understanding formulate the problem as a structured prediction task focusing on estimating the 3D bounding box which best describes the scene layout. Unfortunately, these approaches utilize high-order potentials which are computationally intractable and rely on ad-hoc approximations for both learning and inference. In this paper, we show that the potentials commonly used in the literature can be decomposed into pair-wise potentials by extending the concept of integral images to geometry. As a consequence, no heuristic reduction of the search space is required. In practice, this results in large improvements in performance over the state-of-the-art, while being orders of magnitude faster.
Cite
Text
Schwing et al. "Efficient Structured Prediction for 3D Indoor Scene Understanding." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6248006
Markdown
[Schwing et al. "Efficient Structured Prediction for 3D Indoor Scene Understanding." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/schwing2012cvpr-efficient/) doi:10.1109/CVPR.2012.6248006
BibTeX
@inproceedings{schwing2012cvpr-efficient,
title = {{Efficient Structured Prediction for 3D Indoor Scene Understanding}},
author = {Schwing, Alexander G. and Hazan, Tamir and Pollefeys, Marc and Urtasun, Raquel},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2012},
pages = {2815-2822},
doi = {10.1109/CVPR.2012.6248006},
url = {https://mlanthology.org/cvpr/2012/schwing2012cvpr-efficient/}
}