Enhancing Visual Localization with Cross-Domain Image Generation

Abstract

Visual localization aims to predict the absolute camera pose for a single query image. However, predominant methods focus on single-camera images and scenes with limited appearance variations, which restricts their applicability to the cross-domain scenes commonly encountered in real-world applications. Furthermore, the long-tail distribution of cross-domain datasets poses additional challenges for visual localization. In this work, we propose a novel cross-domain data generation method to enhance visual localization methods. To achieve this, we first construct a cross-domain 3D Gaussian Splatting (3DGS) representation to accurately model photometric variations and mitigate the interference of dynamic objects in large-scale scenes. Next, we introduce a text-guided image editing model that increases data diversity to address the long-tail distribution problem, and we design an effective fine-tuning strategy for it. Then, we develop an anchor-based method to generate high-quality datasets for visual localization. Finally, we introduce positional attention to address data ambiguities in cross-camera images. Extensive experiments show that our method achieves state-of-the-art accuracy, outperforming existing cross-domain visual localization methods by an average of 59% across all domains. Project page: https://yzwang-sjtu.github.io/CDG-Loc.

Cite

Text

Wang et al. "Enhancing Visual Localization with Cross-Domain Image Generation." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Wang et al. "Enhancing Visual Localization with Cross-Domain Image Generation." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/wang2025icml-enhancing-b/)

BibTeX

@inproceedings{wang2025icml-enhancing-b,
  title     = {{Enhancing Visual Localization with Cross-Domain Image Generation}},
  author    = {Wang, Yuanze and Yan, Yichao and Song, Shiming and Jin, Songchang and Huang, Yilan and Sheng, Xingdong and Shi, Dianxi},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {65075--65090},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/wang2025icml-enhancing-b/}
}