SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs
Abstract
We introduce the task of localizing an input image within a multi-modal reference map represented by a collection of 3D scene graphs. These scene graphs comprise multiple modalities, including object-level point clouds, images, attributes, and relationships between objects, offering a lightweight and efficient alternative to conventional methods that rely on extensive image databases. Given these modalities, the proposed method learns a fixed-sized embedding for each node (i.e., object instance) in the scene graph, enabling effective matching with the objects visible in the input query image. This strategy significantly outperforms other cross-modal methods, even without incorporating images into the map representation. With images, SceneGraphLoc achieves performance close to that of state-of-the-art techniques that depend on large image databases, while requiring three orders of magnitude less storage and operating orders of magnitude faster. Code and models are available at https://scenegraphloc.github.io.
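The matching strategy described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each scene graph is reduced to a matrix of per-node embeddings and each query image to a set of detected-object embeddings, and scores a scene by how well every query object finds a matching node.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def scene_score(query_obj_emb, node_emb):
    # For each object visible in the query image, take its best-matching
    # scene-graph node; the scene score is the mean of these best matches.
    # (A hypothetical aggregation, chosen for illustration.)
    sim = cosine_sim(query_obj_emb, node_emb)
    return sim.max(axis=1).mean()

def localize(query_obj_emb, scenes):
    """Coarse localization: return the id of the best-scoring scene.

    scenes: dict mapping scene id -> (num_nodes, D) node-embedding matrix.
    """
    scores = {sid: scene_score(query_obj_emb, emb)
              for sid, emb in scenes.items()}
    return max(scores, key=scores.get)
```

Because each node is a fixed-size vector, a whole scene graph is stored as one small matrix, which is where the storage advantage over image-database retrieval comes from.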
Cite
Text
Miao et al. "SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73242-3_8
Markdown
[Miao et al. "SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/miao2024eccv-scenegraphloc/) doi:10.1007/978-3-031-73242-3_8
BibTeX
@inproceedings{miao2024eccv-scenegraphloc,
title = {{SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs}},
author = {Miao, Yang and Engelmann, Francis and Vysotska, Olga and Tombari, Federico and Pollefeys, Marc and Barath, Daniel},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-73242-3_8},
url = {https://mlanthology.org/eccv/2024/miao2024eccv-scenegraphloc/}
}