OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images

Abstract

This paper presents OmniCity, a new dataset for omnipotent city understanding from multi-level and multi-view images. More precisely, OmniCity contains multi-view satellite images as well as street-level panorama and mono-view images, constituting over 100K pixel-wise annotated images that are well-aligned and collected from 25K geo-locations in New York City. To alleviate the substantial pixel-wise annotation effort, we propose an efficient street-view image annotation pipeline that leverages the existing label maps of the satellite view and the transformation relations between different views (satellite, panorama, and mono-view). With the new OmniCity dataset, we provide benchmarks for a variety of tasks, including building footprint extraction, height estimation, and building plane/instance/fine-grained segmentation. Compared with existing multi-level and multi-view benchmarks, OmniCity contains a larger number of images with richer annotation types and more views, provides benchmark results for more state-of-the-art models, and introduces a new task for fine-grained building instance segmentation on street-level panorama images. Moreover, OmniCity provides new problem settings for existing tasks, such as cross-view image matching, synthesis, segmentation, and detection, and facilitates the development of new methods for large-scale city understanding, reconstruction, and simulation. The OmniCity dataset as well as the benchmarks will be released at https://city-super.github.io/omnicity/.

Cite

Text

Li et al. "OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01669

Markdown

[Li et al. "OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/li2023cvpr-omnicity/) doi:10.1109/CVPR52729.2023.01669

BibTeX

@inproceedings{li2023cvpr-omnicity,
  title     = {{OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images}},
  author    = {Li, Weijia and Lai, Yawen and Xu, Linning and Xiangli, Yuanbo and Yu, Jinhua and He, Conghui and Xia, Gui-Song and Lin, Dahua},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {17397--17407},
  doi       = {10.1109/CVPR52729.2023.01669},
  url       = {https://mlanthology.org/cvpr/2023/li2023cvpr-omnicity/}
}