Enhancing Visual Localization with Cross-Domain Image Generation
Abstract
Visual localization aims to predict the absolute camera pose for a single query image. However, predominant methods focus on single-camera images and scenes with limited appearance variations, which restricts their applicability to the cross-domain scenes commonly encountered in real-world applications. Furthermore, the long-tail distribution of cross-domain datasets poses additional challenges for visual localization. In this work, we propose a novel cross-domain data generation method to enhance visual localization methods. We first construct a cross-domain 3D Gaussian Splatting (3DGS) representation to accurately model photometric variations and mitigate the interference of dynamic objects in large-scale scenes. We then introduce a text-guided image editing model, together with an effective fine-tuning strategy, to increase data diversity and address the long-tail distribution problem. Next, we develop an anchor-based method to generate high-quality datasets for visual localization. Finally, we introduce positional attention to address data ambiguities in cross-camera images. Extensive experiments show that our method achieves state-of-the-art accuracy, outperforming existing cross-domain visual localization methods by an average of 59% across all domains. Project page: https://yzwang-sjtu.github.io/CDG-Loc.
Cite
Text
Wang et al. "Enhancing Visual Localization with Cross-Domain Image Generation." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Wang et al. "Enhancing Visual Localization with Cross-Domain Image Generation." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/wang2025icml-enhancing-b/)
BibTeX
@inproceedings{wang2025icml-enhancing-b,
title = {{Enhancing Visual Localization with Cross-Domain Image Generation}},
author = {Wang, Yuanze and Yan, Yichao and Song, Shiming and Jin, Songchang and Huang, Yilan and Sheng, Xingdong and Shi, Dianxi},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {65075--65090},
volume = {267},
url = {https://mlanthology.org/icml/2025/wang2025icml-enhancing-b/}
}