SegMASt3R: Geometry Grounded Segment Matching
Abstract
Segment matching is an important intermediate task in computer vision that establishes correspondences between semantically or geometrically coherent regions across images. Unlike keypoint matching, which focuses on localized features, segment matching captures structured regions, offering greater robustness to occlusions, lighting variations, and viewpoint changes. In this paper, we leverage the spatial understanding of 3D foundation models to tackle wide-baseline segment matching, a challenging setting involving extreme viewpoint shifts. We propose an architecture that uses the inductive bias of these 3D foundation models to match segments across image pairs with up to $180^\circ$ rotation. Extensive experiments show that our approach outperforms state-of-the-art methods, including the SAM2 video propagator and local feature matching methods, by up to 30\% in AUPRC on the ScanNet++ and Replica datasets. We further demonstrate the benefits of the proposed model on relevant downstream tasks, including 3D instance mapping and object-relative navigation.
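For readers unfamiliar with the task setup, the sketch below illustrates a generic segment-matching baseline and its AUPRC evaluation, not the SegMASt3R architecture: per-pixel descriptors (random placeholders standing in for any dense feature extractor) are average-pooled inside each segment mask, segments are scored by cosine similarity, and average precision is computed against ground-truth correspondences. All function and variable names here are illustrative assumptions.

```python
# Minimal sketch of generic segment matching + AUPRC evaluation.
# NOT the paper's method; dense features are random placeholders.
import numpy as np
from sklearn.metrics import average_precision_score

def pool_segment_descriptors(dense_feats, seg_masks):
    """Average-pool a (H, W, C) dense feature map inside each boolean (H, W) mask."""
    return np.stack([dense_feats[m].mean(axis=0) for m in seg_masks])

def match_segments(desc_a, desc_b):
    """Cosine-similarity matrix between segment descriptors of two images."""
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    return a @ b.T

# Toy example: two images with dense features of shape (H, W, C) from any extractor.
H, W, C = 64, 64, 32
rng = np.random.default_rng(0)
feats_a, feats_b = rng.normal(size=(H, W, C)), rng.normal(size=(H, W, C))

# Boolean segment masks (e.g., from a segmenter such as SAM);
# here simply the left and right halves of each image.
masks_a = [np.zeros((H, W), bool), np.zeros((H, W), bool)]
masks_a[0][:, : W // 2] = True
masks_a[1][:, W // 2 :] = True
masks_b = [m.copy() for m in masks_a]

sim = match_segments(
    pool_segment_descriptors(feats_a, masks_a),
    pool_segment_descriptors(feats_b, masks_b),
)

# Ground truth: segment i in image A corresponds to segment i in image B.
gt = np.eye(len(masks_a), dtype=int)

# AUPRC over all candidate segment pairs, using similarity as the match score.
auprc = average_precision_score(gt.ravel(), sim.ravel())
print(f"AUPRC: {auprc:.3f}")
```

In practice, the dense features and segment masks would come from the chosen feature extractor and segmenter; the pooling, similarity, and AUPRC steps stay the same.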
Cite
Jayanti et al. "SegMASt3R: Geometry Grounded Segment Matching." Advances in Neural Information Processing Systems, 2025.
BibTeX
@inproceedings{jayanti2025neurips-segmast3r,
title = {{SegMASt3R: Geometry Grounded Segment Matching}},
author = {Jayanti, Rohit and Agrawal, Swayam and Garg, Vansh and Tourani, Siddharth and Khan, Muhammad Haris and Garg, Sourav and Krishna, Madhava},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/jayanti2025neurips-segmast3r/}
}