CAFF-DINO: Multi-Spectral Object Detection Transformers with Cross-Attention Features Fusion

Abstract

Object detection on images can benefit from coupling multiple spectra, each of which provides specific useful features. However, building an efficient architecture that couples the different modalities is a complex task. Transformers, thanks to their ability to extract meaningful correlations between different regions of the inputs, appear to be a promising way to fuse features across spectra. This work presents a multi-spectral object detection architecture based on cross-attention features fusion (CAFF), combined with a transformer-based detector (DINO). We compare the object detection performance of the proposed approach with state-of-the-art approaches on infrared-visible multi-spectral datasets. Moreover, we study its robustness to systematic misalignment between image pairs. The proposed approach is applicable to any mono-spectrum transformer-based detector. The model developed in this study will be available in a dedicated GitHub repository.
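To make the idea of cross-attention fusion between two spectra more concrete, here is a minimal sketch in plain Python. It is not the CAFF module from the paper: the abstract gives no layer details, so the function names (`cross_attention`, `fuse`), the absence of learned projections, the residual sum, and the per-token concatenation are all assumptions chosen for illustration. It also assumes both spectra yield the same number of tokens.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention where each query token (one spectrum)
    attends over the tokens of the other spectrum.
    Assumption: no learned projections, so Q = queries and K = V = keys_values."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)])
    return out


def fuse(visible, infrared):
    """Hypothetical symmetric fusion: each stream is enriched with what it
    attends to in the other stream (residual sum), then the two streams are
    concatenated token by token."""
    vis_from_ir = cross_attention(visible, infrared)
    ir_from_vis = cross_attention(infrared, visible)
    vis = [[a + b for a, b in zip(t, u)] for t, u in zip(visible, vis_from_ir)]
    ir = [[a + b for a, b in zip(t, u)] for t, u in zip(infrared, ir_from_vis)]
    return [v + i for v, i in zip(vis, ir)]


# Toy feature tokens: 3 tokens of dimension 2 per spectrum (made-up values).
visible = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
infrared = [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]]
fused = fuse(visible, infrared)
print(len(fused), len(fused[0]))
```

In a real detector the fused tokens would be passed on to the detection transformer, and the queries, keys, and values would come from learned linear projections of backbone features rather than from the raw tokens used here.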

Cite

Text

Helvig et al. "CAFF-DINO: Multi-Spectral Object Detection Transformers with Cross-Attention Features Fusion." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00309

Markdown

[Helvig et al. "CAFF-DINO: Multi-Spectral Object Detection Transformers with Cross-Attention Features Fusion." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/helvig2024cvprw-caffdino/) doi:10.1109/CVPRW63382.2024.00309

BibTeX

@inproceedings{helvig2024cvprw-caffdino,
  title     = {{CAFF-DINO: Multi-Spectral Object Detection Transformers with Cross-Attention Features Fusion}},
  author    = {Helvig, Kevin and Abeloos, Baptiste and Trouvé-Peloux, Pauline},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2024},
  pages     = {3037-3046},
  doi       = {10.1109/CVPRW63382.2024.00309},
  url       = {https://mlanthology.org/cvprw/2024/helvig2024cvprw-caffdino/}
}