ETO: Efficient Transformer-Based Local Feature Matching by Organizing Multiple Homography Hypotheses
Abstract
We tackle the efficiency problem of learning local feature matching. Recent advances have produced both purely CNN-based and transformer-based approaches, each augmented with deep learning techniques. While CNN-based methods often excel in matching speed, transformer-based methods tend to provide more accurate matches. We propose an efficient transformer-based network architecture for local feature matching. The technique is built on constructing multiple homography hypotheses to approximate the continuous correspondence in the real world, and on uni-directional cross-attention to accelerate the refinement. On the YFCC100M dataset, our matching accuracy is competitive with LoFTR, a state-of-the-art transformer-based architecture, while inference is roughly four times faster, outperforming even the CNN-based methods. Comprehensive evaluations on other open datasets such as MegaDepth, ScanNet, and HPatches demonstrate our method's efficacy, highlighting its potential to significantly enhance a wide array of downstream applications.
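As a rough illustration of the two ideas named in the abstract (a minimal sketch, not the authors' implementation; the names warp_by_homographies and UniDirectionalCrossAttention are hypothetical), the PyTorch snippet below warps patch coordinates with per-patch homography hypotheses and applies cross-attention in a single direction, so that only one image's features are refined:

import torch
import torch.nn as nn

def warp_by_homographies(pts: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
    # pts: (P, N, 2) point coordinates inside each of P local patches.
    # H:   (P, 3, 3) one homography hypothesis per patch, each locally
    #      approximating the continuous correspondence field.
    ones = torch.ones_like(pts[..., :1])
    homog = torch.cat([pts, ones], dim=-1)            # (P, N, 3) homogeneous coords
    warped = torch.einsum("pij,pnj->pni", H, homog)   # apply each hypothesis
    return warped[..., :2] / warped[..., 2:].clamp(min=1e-8)  # de-homogenize, guard /0

class UniDirectionalCrossAttention(nn.Module):
    # Queries come from one image only, so only that image's features are
    # refined; the reverse (target -> source) pass of bi-directional schemes
    # is skipped, roughly halving the attention cost per layer.
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        updated, _ = self.attn(src, tgt, tgt)  # queries: src; keys/values: tgt
        return self.norm(src + updated)

# Toy usage: 64 patches of 16 points each, and 256-dim descriptors.
pts = torch.rand(64, 16, 2)
H = torch.eye(3).expand(64, 3, 3)
print(warp_by_homographies(pts, H).shape)  # torch.Size([64, 16, 2])
layer = UniDirectionalCrossAttention(dim=256)
print(layer(torch.randn(1, 1024, 256), torch.randn(1, 1024, 256)).shape)

Refining only one direction per layer is consistent with the speedup the abstract attributes to uni-directional cross-attention; how the hypotheses are organized and verified is specific to the paper and not reproduced here.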
Cite
Text
Ni et al. "ETO: Efficient Transformer-Based Local Feature Matching by Organizing Multiple Homography Hypotheses." Neural Information Processing Systems, 2024. doi:10.52202/079017-1926

Markdown
[Ni et al. "ETO: Efficient Transformer-Based Local Feature Matching by Organizing Multiple Homography Hypotheses." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/ni2024neurips-eto/) doi:10.52202/079017-1926

BibTeX
@inproceedings{ni2024neurips-eto,
title = {{ETO: Efficient Transformer-Based Local Feature Matching by Organizing Multiple Homography Hypotheses}},
author = {Ni, Junjie and Zhang, Guofeng and Li, Guanglin and Li, Yijin and Liu, Xinyang and Huang, Zhaoyang and Bao, Hujun},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-1926},
url = {https://mlanthology.org/neurips/2024/ni2024neurips-eto/}
}