Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker
Abstract
Current research on adversarial attacks mainly focuses on RGB trackers, with no existing methods for attacking RGB-T cross-modal trackers. To fill this gap and overcome its challenges, we propose a progressive adversarial patch generation framework and achieve cross-modal stealth. On the one hand, we design a coarse-to-fine architecture grounded in the latent space to progressively and precisely uncover the vulnerabilities of RGB-T trackers. On the other hand, we introduce a correlation-breaking loss that disrupts the modal coupling within trackers, spanning from the pixel to the semantic level. These two design elements ensure that the proposed method can overcome the obstacles posed by cross-modal information complementarity in implementing attacks. Furthermore, to enhance the reliable application of the adversarial patches in the real world, we develop a point tracking-based reprojection strategy that effectively mitigates performance degradation caused by multi-angle distortion during imaging. Extensive experiments demonstrate the superiority of our method.
Cite
Text
Xiang et al. "Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I8.32931
Markdown
[Xiang et al. "Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/xiang2025aaai-cross/) doi:10.1609/AAAI.V39I8.32931
BibTeX
@inproceedings{xiang2025aaai-cross,
title = {{Cross-Modal Stealth: A Coarse-to-Fine Attack Framework for RGB-T Tracker}},
author = {Xiang, Xinyu and Yan, Qinglong and Zhang, Hao and Ding, Jianfeng and Xu, Han and Wang, Zhongyuan and Ma, Jiayi},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {8620--8627},
doi = {10.1609/AAAI.V39I8.32931},
url = {https://mlanthology.org/aaai/2025/xiang2025aaai-cross/}
}