Affine-Based Deformable Attention and Selective Fusion for Semi-Dense Matching

Abstract

Identifying robust and accurate correspondences across images is a fundamental problem in computer vision that enables various downstream tasks. Recent semi-dense matching methods emphasize the effectiveness of fusing relevant cross-view information through Transformers. In this paper, we propose several improvements upon this paradigm. Firstly, we introduce affine-based local attention to model cross-view deformations. Secondly, we present selective fusion to merge local and global messages from cross attention. Apart from network structure, we also identify the importance of enforcing spatial smoothness in loss design, which has been omitted by previous works. Based on these augmentations, our network demonstrates strong matching capacity under different settings. The full version of our network achieves state-of-the-art performance among semi-dense matching methods at a similar cost to LoFTR, while the slim version reaches the LoFTR baseline’s performance with only 15% of the computation cost and 18% of the parameters.

Cite

Text

Chen et al. "Affine-Based Deformable Attention and Selective Fusion for Semi-Dense Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00429

Markdown

[Chen et al. "Affine-Based Deformable Attention and Selective Fusion for Semi-Dense Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/chen2024cvprw-affinebased/) doi:10.1109/CVPRW63382.2024.00429

BibTeX

@inproceedings{chen2024cvprw-affinebased,
  title     = {{Affine-Based Deformable Attention and Selective Fusion for Semi-Dense Matching}},
  author    = {Chen, Hongkai and Luo, Zixin and Tian, Yurun and Bai, Xuyang and Wang, Ziyu and Zhou, Lei and Zhen, Mingmin and Fang, Tian and McKinnon, David and Tsin, Yanghai and Quan, Long},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {4254-4263},
  doi       = {10.1109/CVPRW63382.2024.00429},
  url       = {https://mlanthology.org/cvprw/2024/chen2024cvprw-affinebased/}
}