3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching

Abstract

We tackle the essential task of finding dense visual correspondences between a pair of images. This is a challenging problem in practice because of factors such as poor texture, repetitive patterns, illumination variation, and motion blur. In contrast to methods that use dense correspondence ground truth as direct supervision for training local feature matching, we train 3DG-STFM: a multi-modal matching model (Teacher) that enforces depth consistency under 3D dense correspondence supervision and transfers its knowledge to a 2D unimodal matching model (Student). Both the teacher and student models consist of two transformer-based matching modules that obtain dense correspondences in a coarse-to-fine manner. The teacher model guides the student model to learn RGB-induced depth information for matching on both the coarse and fine branches. We also evaluate 3DG-STFM on a model compression task. To the best of our knowledge, 3DG-STFM is the first student-teacher learning method for the local feature matching task. Experiments show that our method outperforms state-of-the-art methods on indoor and outdoor camera pose estimation and on homography estimation. Code is available at: https://github.com/Ryan-prime/3DG-STFM
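The teacher-to-student transfer described above can be illustrated with a generic knowledge-distillation objective. The sketch below is not taken from the 3DG-STFM code base. It assumes a standard temperature-softened KL term between teacher and student matching scores for one query location, combined with a cross-entropy term on the ground-truth match. The function names, the `alpha` weight, and the `temperature` value are all hypothetical choices for illustration.

```python
import math


def softmax(scores, temperature=1.0):
    """Temperature-scaled softmax over a list of candidate-match scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """KL(teacher || student) between softened match distributions.

    Assumption: a generic distillation term, not the paper's exact loss.
    """
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def student_loss(gt_index, teacher_scores, student_scores,
                 alpha=0.5, temperature=2.0):
    """Blend ground-truth supervision with guidance from the teacher.

    alpha (hypothetical) weights the distillation term against the
    cross-entropy on the ground-truth correspondence index.
    """
    q = softmax(student_scores)
    supervised = -math.log(q[gt_index])
    distill = distillation_loss(teacher_scores, student_scores, temperature)
    return (1.0 - alpha) * supervised + alpha * distill


if __name__ == "__main__":
    # Scores of one query patch against four candidate patches.
    teacher = [4.0, 1.0, 0.5, 0.2]   # RGB-D teacher: confident in candidate 0
    student = [2.0, 1.5, 0.5, 0.2]   # RGB-only student: less certain
    print(student_loss(0, teacher, student))
```

In this sketch the distillation term vanishes when the student reproduces the teacher's scores, so the student is pulled toward the teacher's depth-informed match distribution while still being supervised by the ground-truth correspondence.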

Cite

Text

Mao et al. "3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19815-1_8

Markdown

[Mao et al. "3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/mao2022eccv-3dgstfm/) doi:10.1007/978-3-031-19815-1_8

BibTeX

@inproceedings{mao2022eccv-3dgstfm,
  title     = {{3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching}},
  author    = {Mao, Runyu and Bai, Chen and An, Yatong and Zhu, Fengqing and Lu, Cheng},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19815-1_8},
  url       = {https://mlanthology.org/eccv/2022/mao2022eccv-3dgstfm/}
}