Holistic 3D Scene Understanding from a Single Geo-Tagged Image

Abstract

In this paper, we are interested in exploiting geographic priors to help outdoor scene understanding. Towards this goal, we propose a holistic approach that reasons jointly about 3D object detection, pose estimation, semantic segmentation, and depth reconstruction from a single image. Our approach takes advantage of large-scale crowd-sourced maps to generate dense geographic, geometric, and semantic priors by rendering the 3D world. We demonstrate the effectiveness of our holistic model on the challenging KITTI dataset and show significant improvements over the baselines in all metrics and tasks.

Cite

Text

Wang et al. "Holistic 3D Scene Understanding from a Single Geo-Tagged Image." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7299022

Markdown

[Wang et al. "Holistic 3D Scene Understanding from a Single Geo-Tagged Image." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/wang2015cvpr-holistic/) doi:10.1109/CVPR.2015.7299022

BibTeX

@inproceedings{wang2015cvpr-holistic,
  title     = {{Holistic 3D Scene Understanding from a Single Geo-Tagged Image}},
  author    = {Wang, Shenlong and Fidler, Sanja and Urtasun, Raquel},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2015},
  doi       = {10.1109/CVPR.2015.7299022},
  url       = {https://mlanthology.org/cvpr/2015/wang2015cvpr-holistic/}
}