Breaking the Frame: Visual Place Recognition by Overlap Prediction

Abstract

Visual place recognition methods struggle with occlusion and partial visual overlap. We propose VOP, a novel visual place recognition approach based on overlap prediction, which shifts from the traditional reliance on global image similarities and local features to image overlap prediction. VOP identifies co-visible image sections by obtaining patch-level embeddings with a Vision Transformer backbone and establishing patch-to-patch correspondences, without requiring expensive feature detection and matching. Our approach uses a voting mechanism to assess overlap scores for potential database images, providing a nuanced image retrieval metric in challenging scenarios. Experimental results show that VOP leads to more accurate relative pose estimation and localization results on the retrieved image pairs than state-of-the-art baselines on a number of large-scale real-world indoor and outdoor benchmarks. The code is available at https://github.com/weitong8591/vop.git.
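
To make the idea of patch-level voting concrete, the following is a minimal sketch and not the authors' implementation. It assumes cosine similarity between patch embeddings, a fixed similarity threshold, and one vote per matched query patch; the function names (`cosine`, `overlap_score`, `rank_database`), the threshold value, and the toy 2-D embeddings are all hypothetical and chosen only for illustration. In practice the embeddings would come from a Vision Transformer backbone.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def overlap_score(query_patches, db_patches, threshold=0.8):
    """Fraction of query patches that find a similar enough database patch.

    Each query patch casts one vote if its best-matching database patch
    reaches the threshold. The threshold is an illustrative assumption.
    """
    if not query_patches or not db_patches:
        return 0.0
    votes = 0
    for q in query_patches:
        best = max(cosine(q, d) for d in db_patches)
        if best >= threshold:
            votes += 1
    return votes / len(query_patches)


def rank_database(query_patches, database, threshold=0.8):
    """Return (image_id, score) pairs sorted by descending overlap score."""
    scores = {
        image_id: overlap_score(query_patches, patches, threshold)
        for image_id, patches in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 2-D stand-ins for patch embeddings (hypothetical data).
query = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
database = {
    "full_overlap": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "partial_overlap": [[1.0, 0.0], [-1.0, 0.0]],
    "no_overlap": [[-1.0, -1.0]],
}
ranking = rank_database(query, database)
```

In this toy setup the database image that shares all patches with the query ranks first, the one sharing a single patch ranks second, and the unrelated image ranks last. The score depends on how many patches are co-visible rather than on a single global descriptor, which is the intuition the abstract describes.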

Cite

Text

Wei et al. "Breaking the Frame: Visual Place Recognition by Overlap Prediction." Winter Conference on Applications of Computer Vision, 2025.

Markdown

[Wei et al. "Breaking the Frame: Visual Place Recognition by Overlap Prediction." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/wei2025wacv-breaking/)

BibTeX

@inproceedings{wei2025wacv-breaking,
  title     = {{Breaking the Frame: Visual Place Recognition by Overlap Prediction}},
  author    = {Wei, Tong and Lindenberger, Philipp and Matas, Jiří and Barath, Daniel},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2025},
  pages     = {2322-2331},
  url       = {https://mlanthology.org/wacv/2025/wei2025wacv-breaking/}
}