Box in the Box: Joint 3D Layout and Object Reasoning from Single Images
Abstract
In this paper, we propose an approach to jointly infer the room layout and the objects present in the scene. Toward this goal, we propose a branch and bound algorithm that is guaranteed to retrieve the global optimum of the joint problem. The main difficulty lies in accounting for occlusion so that evidence is not over-counted. We introduce a new decomposition method, which generalizes integral geometry to triangular shapes and allows us to bound the different terms in constant time. We exploit both geometric cues and object detectors as image features and show large improvements in 2D and 3D object detection over state-of-the-art deformable part-based models.
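For readers unfamiliar with the idea of evaluating a term in constant time, the sketch below shows the standard rectangular summed-area table (integral image), which is the kind of precomputation that integral geometry relies on. It is not the paper's method: the triangular decomposition described in the abstract is not reproduced here, and the names `integral_image`, `rect_sum`, and `evidence` are hypothetical, chosen only for illustration.

```python
def integral_image(grid):
    """Return a (H+1) x (W+1) summed-area table for a 2D list of numbers."""
    h = len(grid)
    w = len(grid[0]) if h else 0
    # Extra zero row and column so lookups need no boundary checks.
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += grid[r][c]
            # Sum of everything above and to the left, inclusive.
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 in four lookups."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


# Hypothetical per-pixel evidence values (assumption, not data from the paper).
evidence = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(evidence)
total = rect_sum(table, 0, 0, 3, 3)   # whole grid
inner = rect_sum(table, 1, 1, 3, 3)   # bottom-right 2x2 block
```

After the one-time precomputation, each rectangle query costs four lookups regardless of the rectangle's size, which is what makes repeated bound evaluations inside a search such as branch and bound cheap.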
Cite
Text
Schwing et al. "Box in the Box: Joint 3D Layout and Object Reasoning from Single Images." International Conference on Computer Vision, 2013. doi:10.1109/ICCV.2013.51
Markdown
[Schwing et al. "Box in the Box: Joint 3D Layout and Object Reasoning from Single Images." International Conference on Computer Vision, 2013.](https://mlanthology.org/iccv/2013/schwing2013iccv-box/) doi:10.1109/ICCV.2013.51
BibTeX
@inproceedings{schwing2013iccv-box,
title = {{Box in the Box: Joint 3D Layout and Object Reasoning from Single Images}},
author = {Schwing, Alexander G. and Fidler, Sanja and Pollefeys, Marc and Urtasun, Raquel},
booktitle = {International Conference on Computer Vision},
year = {2013},
doi = {10.1109/ICCV.2013.51},
url = {https://mlanthology.org/iccv/2013/schwing2013iccv-box/}
}