Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking
Abstract
While discriminative correlation filter (DCF)-based trackers prevail in UAV tracking for their favorable efficiency, lightweight convolutional neural network (CNN)-based trackers using filter pruning have also demonstrated remarkable efficiency and precision. However, the use of pure vision transformer models (ViTs) for UAV tracking remains unexplored, which is surprising given that ViTs have been shown to produce better performance and greater efficiency than CNNs in image classification. In this paper, we propose an efficient ViT-based tracking framework, Aba-ViTrack, for UAV tracking. In our framework, feature learning and template-search coupling are integrated into an efficient one-stream ViT to avoid an extra heavy relation modeling module. The proposed Aba-ViT exploits an adaptive and background-aware token computation method to reduce inference time. This approach adaptively discards tokens based on learned halting probabilities, which are a priori higher for background tokens than for target tokens. Extensive experiments on six UAV tracking benchmarks demonstrate that the proposed Aba-ViTrack achieves state-of-the-art performance in UAV tracking. Code is available at https://github.com/xyyang317/Aba-ViTrack.
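As a rough illustration of discarding tokens by halting probability, the sketch below accumulates a per-layer halting score for each token and stops processing a token once its cumulative score reaches a threshold. This is a generic adaptive-computation-style scheme written under assumptions: the abstract does not specify the exact halting rule, and the function name `halting_masks`, the threshold parameter `eps`, and the example scores are all hypothetical rather than taken from the Aba-ViTrack code.

```python
def halting_masks(halting_probs, eps=0.01):
    """Return, for each layer, which tokens are still being processed.

    halting_probs: list of layers, each a list of per-token halting
    scores in [0, 1] (assumed to come from a learned predictor).
    A token is dropped once its cumulative score reaches 1 - eps.
    This is an illustrative rule, not the authors' confirmed method.
    """
    num_tokens = len(halting_probs[0])
    cumulative = [0.0] * num_tokens
    active = [True] * num_tokens
    masks = []
    for layer_probs in halting_probs:
        # Record which tokens enter this layer.
        masks.append(list(active))
        for t in range(num_tokens):
            if active[t]:
                cumulative[t] += layer_probs[t]
                if cumulative[t] >= 1.0 - eps:
                    active[t] = False  # token is discarded from later layers
    return masks


# Hypothetical scores: token 0 is a target token (low halting score),
# token 1 is a background token (high halting score).
scores = [[0.1, 0.6], [0.1, 0.6], [0.1, 0.6]]
masks = halting_masks(scores)
layers_used = [sum(layer[t] for layer in masks) for t in range(2)]
print(layers_used)
```

In this toy setting the background token exits earlier than the target token, which is the source of the inference-time saving the abstract describes.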
Cite
Text
Li et al. "Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01286
Markdown
[Li et al. "Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/li2023iccv-adaptive/) doi:10.1109/ICCV51070.2023.01286
BibTeX
@inproceedings{li2023iccv-adaptive,
title = {{Adaptive and Background-Aware Vision Transformer for Real-Time UAV Tracking}},
author = {Li, Shuiwang and Yang, Yangxiang and Zeng, Dan and Wang, Xucheng},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {13989-14000},
doi = {10.1109/ICCV51070.2023.01286},
url = {https://mlanthology.org/iccv/2023/li2023iccv-adaptive/}
}