Semantic Ambiguity Modeling and Propagation for Fine-Grained Visual Cross View Geo-Localization

Abstract

Visual cross view geo-localization is generally approached within a joint retrieval-and-calibration framework. However, existing methods overlook semantic ambiguities arising from query and reference images characterized by low overlap, dynamic foregrounds, viewpoint changes, and perceptual aliasing. This makes it challenging to automatically control the relative importance of the two tasks, potentially compromising the retrieval task in favor of the offset regression. Consequently, the model may encounter conflicting dominating gradients during joint training. To address this, we propose to model the semantic ambiguity during the offset regression process by integrating associated uncertainty scores, represented as 2D Gaussian distributions, to mitigate negative transfer effects within the joint tasks. We further introduce an uncertainty-aware similarity metric to enhance similarity assessment between query and reference images, accounting for their semantic ambiguities. This metric propagates uncertainty scores into the retrieval task, focusing on certain samples and learning discriminative feature embeddings, allowing the model to adaptively handle conflicting dominating gradients during joint training. Extensive experiments demonstrate that our method improves the overall performance of the joint tasks, achieving state-of-the-art results on the VIGOR and CVACT datasets.
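The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names, under stated assumptions, and not the authors' implementation. It assumes the 2D Gaussian is axis-aligned with a predicted per-axis standard deviation, that the offset loss is the Gaussian negative log-likelihood, and that the uncertainty-aware similarity is a cosine similarity scaled by a confidence weight. The function names and the weighting form `1 / (1 + sigma_x * sigma_y)` are hypothetical choices made for illustration.

```python
import math


def gaussian_offset_nll(pred, target, sigma):
    """Negative log-likelihood of a 2D offset under an axis-aligned Gaussian.

    pred, target: (dx, dy) offsets; sigma: per-axis standard deviations (> 0).
    Assumption: independent axes; the paper may use a different parameterization.
    """
    nll = 0.0
    for p, t, s in zip(pred, target, sigma):
        nll += 0.5 * ((t - p) / s) ** 2 + math.log(s) + 0.5 * math.log(2 * math.pi)
    return nll


def confidence_weight(sigma):
    """Map predicted uncertainty to a weight in (0, 1] (hypothetical form)."""
    return 1.0 / (1.0 + sigma[0] * sigma[1])


def cosine_similarity(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def uncertainty_aware_similarity(query, reference, sigma):
    """Scale similarity by confidence so ambiguous pairs contribute less."""
    return confidence_weight(sigma) * cosine_similarity(query, reference)
```

Under these assumptions, a large predicted standard deviation lowers the regression penalty for a hard, ambiguous pair and also shrinks that pair's similarity score, which is one way uncertainty estimated in the offset task could be carried into the retrieval task.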

Cite

Text

Feng et al. "Semantic Ambiguity Modeling and Propagation for Fine-Grained Visual Cross View Geo-Localization." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I3.32305

Markdown

[Feng et al. "Semantic Ambiguity Modeling and Propagation for Fine-Grained Visual Cross View Geo-Localization." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/feng2025aaai-semantic/) doi:10.1609/AAAI.V39I3.32305

BibTeX

@inproceedings{feng2025aaai-semantic,
  title     = {{Semantic Ambiguity Modeling and Propagation for Fine-Grained Visual Cross View Geo-Localization}},
  author    = {Feng, Mingtao and Tian, Fenghao and Luo, Jianqiao and Wu, Zijie and Dong, Weisheng and Wang, Yaonan and Mian, Ajmal Saeed},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {2978--2986},
  doi       = {10.1609/AAAI.V39I3.32305},
  url       = {https://mlanthology.org/aaai/2025/feng2025aaai-semantic/}
}