Statewide Visual Geolocalization in the Wild
Abstract
This work presents a method that predicts the geolocation of a street-view photo taken in the wild within a state-sized search region by matching it against a database of aerial reference imagery. We partition the search region into geographical cells and train a model to map cells and corresponding photos into a joint embedding space that is used to perform retrieval at test time. The model utilizes aerial images for each cell at multiple levels of detail to provide sufficient information about the surrounding scene. We propose a novel layout of the search region with consistent cell resolutions that allows scaling to large geographical regions. Experiments demonstrate that the method successfully localizes 60.6% of all non-panoramic street-view photos uploaded to the crowd-sourcing platform Mapillary in the state of Massachusetts to within 50m of their ground-truth location. Source code is available at https://github.com/fferflo/statewide-visu
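To make the retrieval step described in the abstract concrete, the following is a minimal, hypothetical NumPy sketch of embedding-based localization over geographic cells: the query photo and all cells are assumed to have been mapped into a joint embedding space by some trained encoders, and the photo is assigned the location of the most similar cell. All names, dimensions, and the brute-force nearest-neighbor search are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def localize(photo_embedding: np.ndarray, cell_embeddings: np.ndarray, cell_centers: np.ndarray):
    """Return the center of the best-matching cell and its similarity score.

    photo_embedding: (D,)   embedding of the query street-view photo.
    cell_embeddings: (N, D) embeddings of all cells in the search region.
    cell_centers:    (N, 2) latitude/longitude of each cell center.
    """
    # Cosine similarity between the query and every cell (embeddings assumed L2-normalized).
    scores = cell_embeddings @ photo_embedding
    best = int(np.argmax(scores))
    return cell_centers[best], float(scores[best])

# Usage with random stand-ins for real encoder outputs:
rng = np.random.default_rng(0)
cells = rng.normal(size=(10_000, 256)).astype(np.float32)
cells /= np.linalg.norm(cells, axis=1, keepdims=True)
centers = np.stack([rng.uniform(41.2, 42.9, 10_000),    # latitudes (roughly Massachusetts)
                    rng.uniform(-73.5, -69.9, 10_000)],  # longitudes
                   axis=1)
query = cells[1234] + 0.05 * rng.normal(size=256)        # noisy copy of one cell embedding
query /= np.linalg.norm(query)
print(localize(query, cells, centers))
```

In practice the exhaustive dot product would be replaced by an approximate nearest-neighbor index when the number of cells grows to state scale, but the retrieval logic is the same.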
Cite
Text
Fervers et al. "Statewide Visual Geolocalization in the Wild." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72764-1_25
Markdown
[Fervers et al. "Statewide Visual Geolocalization in the Wild." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/fervers2024eccv-statewide/) doi:10.1007/978-3-031-72764-1_25
BibTeX
@inproceedings{fervers2024eccv-statewide,
  title     = {{Statewide Visual Geolocalization in the Wild}},
  author    = {Fervers, Florian and Bullinger, Sebastian and Bodensteiner, Christoph and Arens, Michael and Stiefelhagen, Rainer},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72764-1_25},
  url       = {https://mlanthology.org/eccv/2024/fervers2024eccv-statewide/}
}