Single View Scene Scale Estimation Using Scale Field

Abstract

In this paper, we propose a single-image scale estimation method based on a novel scale field representation. A scale field defines the local pixel-to-metric conversion ratio along the gravity direction at every ground pixel. This representation resolves the ambiguity in camera parameters, allowing us to use a simple yet effective way to collect scale annotations on arbitrary images from human annotators. By training our model on calibrated panoramic image data and in-the-wild human-annotated data, our single-image scene scale estimation network generates robust scale fields for a variety of images, which can be utilized in various 3D understanding and scale-aware image editing applications.
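To make the representation concrete, here is a minimal sketch of how a scale field, as described in the abstract, could be consumed downstream. All names and values are hypothetical assumptions for illustration: the scale field is assumed to be a dense per-pixel map giving the number of image pixels per meter along the gravity direction at each ground pixel, so an object's metric height follows by dividing its pixel extent by the field value at its ground-contact point.

```python
import numpy as np

# Hypothetical scale field for a 480x640 image: at each ground pixel,
# the number of image pixels spanning one meter along gravity.
# A real field varies spatially; a constant toy value is used here.
scale_field = np.full((480, 640), 120.0)  # 120 px per meter

def metric_height(pixel_height: float, ground_contact: tuple) -> float:
    """Convert an object's pixel height to meters using the scale
    field value at its ground-contact pixel (row, col)."""
    pixels_per_meter = scale_field[ground_contact]
    return pixel_height / pixels_per_meter

# A person spanning 210 px whose feet touch ground pixel (400, 320):
height_m = metric_height(210.0, (400, 320))
print(round(height_m, 2))  # 1.75
```

Note that because the field is defined per ground pixel, no global camera intrinsics or height are needed at query time; the local ratio alone resolves the scale.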

Cite

Text

Lee et al. "Single View Scene Scale Estimation Using Scale Field." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.02053

Markdown

[Lee et al. "Single View Scene Scale Estimation Using Scale Field." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/lee2023cvpr-single/) doi:10.1109/CVPR52729.2023.02053

BibTeX

@inproceedings{lee2023cvpr-single,
  title     = {{Single View Scene Scale Estimation Using Scale Field}},
  author    = {Lee, Byeong-Uk and Zhang, Jianming and Hold-Geoffroy, Yannick and Kweon, In So},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {21435--21444},
  doi       = {10.1109/CVPR52729.2023.02053},
  url       = {https://mlanthology.org/cvpr/2023/lee2023cvpr-single/}
}