Rethinking Visual Geo-Localization for Large-Scale Applications
Abstract
Visual Geo-localization (VG) is the task of estimating the position where a given photo was taken by comparing it with a large database of images of known locations. To investigate how existing techniques would perform on a real-world city-wide VG application, we build San Francisco eXtra Large, a new dataset covering a whole city and providing a wide range of challenging cases, with a size 30x bigger than the previous largest dataset for visual geo-localization. We find that current methods fail to scale to such large datasets; we therefore design a new, highly scalable training technique called CosPlace, which casts the training as a classification problem, avoiding the expensive mining needed by the commonly used contrastive learning. We achieve state-of-the-art performance on a wide range of datasets, and find that CosPlace is robust to heavy domain changes. Moreover, we show that, compared to the previous state of the art, CosPlace requires roughly 80% less GPU memory at train time and achieves better results with 8x smaller descriptors, paving the way for city-wide real-world visual geo-localization. Dataset, code and trained models are available for research purposes at https://github.com/gmberton/CosPlace.
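The retrieval step described in the first sentence of the abstract (comparing a query photo against a database of images with known locations) can be illustrated with a minimal sketch. This is not the CosPlace code: the function names, the three-dimensional descriptors, and the coordinates are all hypothetical toy values, and the use of cosine similarity for the comparison is an assumption made for illustration. In practice the descriptors would come from a trained model.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a, b):
    """Cosine similarity between two descriptors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def localize(query_descriptor, database, k=1):
    """Return the k (similarity, position) pairs most similar to the query.

    `database` is a list of (descriptor, position) pairs, where position is
    the known location of the database image (hypothetical layout).
    """
    scored = [
        (cosine_similarity(query_descriptor, descriptor), position)
        for descriptor, position in database
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy database: made-up descriptors paired with made-up (lat, lon) positions.
database = [
    ([1.0, 0.0, 0.0], (37.7749, -122.4194)),
    ([0.0, 1.0, 0.0], (37.8024, -122.4058)),
    ([0.0, 0.0, 1.0], (37.7694, -122.4862)),
]

# A query descriptor that is closest to the second database entry.
query = [0.1, 0.9, 0.0]
best_score, best_position = localize(query, database, k=1)[0]
```

The estimated position of the query is simply the known position of its best match. Because every query is compared against every database descriptor, smaller descriptors directly reduce the memory and compute cost of this step, which is why the descriptor-size claim in the abstract matters at city scale.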
Cite
Text
Berton et al. "Rethinking Visual Geo-Localization for Large-Scale Applications." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00483
Markdown
[Berton et al. "Rethinking Visual Geo-Localization for Large-Scale Applications." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/berton2022cvpr-rethinking/) doi:10.1109/CVPR52688.2022.00483
BibTeX
@inproceedings{berton2022cvpr-rethinking,
title = {{Rethinking Visual Geo-Localization for Large-Scale Applications}},
author = {Berton, Gabriele and Masone, Carlo and Caputo, Barbara},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {4878-4888},
doi = {10.1109/CVPR52688.2022.00483},
url = {https://mlanthology.org/cvpr/2022/berton2022cvpr-rethinking/}
}