Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Abstract

From a single view of an urban environment, we propose a method to effectively exploit the global structural regularities for obtaining a compact, accurate, and intuitive 3D wireframe representation. Our method trains a single convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points. Compared with state-of-the-art learning-based wireframe detection methods, our network is much simpler and more unified, leading to better 2D wireframe detection. With a global structural prior (such as the Manhattan assumption), our method further reconstructs a full 3D wireframe model, a compact vector representation suitable for a variety of high-level vision tasks such as AR and CAD. We conduct extensive evaluations of our method on a large new synthetic dataset of urban scenes as well as real images. Our code and datasets will be published along with the paper.

Cite

Text

Zhou et al. "Learning to Reconstruct 3D Manhattan Wireframes from a Single Image." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019. doi:10.1109/ICCV.2019.00779

Markdown

[Zhou et al. "Learning to Reconstruct 3D Manhattan Wireframes from a Single Image." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.](https://mlanthology.org/iccv/2019/zhou2019iccv-learning/) doi:10.1109/ICCV.2019.00779

BibTeX

@inproceedings{zhou2019iccv-learning,
  title     = {{Learning to Reconstruct 3D Manhattan Wireframes from a Single Image}},
  author    = {Zhou, Yichao and Qi, Haozhi and Zhai, Yuexiang and Sun, Qi and Chen, Zhili and Wei, Li-Yi and Ma, Yi},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2019},
  doi       = {10.1109/ICCV.2019.00779},
  url       = {https://mlanthology.org/iccv/2019/zhou2019iccv-learning/}
}