FARTrack: Fast Autoregressive Visual Tracking with High Performance

Abstract

Inference speed and tracking performance are two critical evaluation metrics in visual tracking. However, high-performance trackers often suffer from slow processing speeds, making them impractical for deployment on resource-constrained devices. To alleviate this issue, we propose $\textbf{FARTrack}$, a $\textbf{F}$ast $\textbf{A}$uto-$\textbf{R}$egressive $\textbf{T}$racking framework. Because autoregression emphasizes the temporal nature of the trajectory sequence, FARTrack can maintain high performance while executing efficiently across various devices. FARTrack introduces $\textbf{Task-Specific Self-Distillation}$ and $\textbf{Inter-frame Autoregressive Sparsification}$, designed from the perspectives of $\textbf{shallow-yet-accurate distillation}$ and $\textbf{redundant-to-essential token optimization}$, respectively. Task-Specific Self-Distillation compresses the model by distilling task-specific tokens layer by layer, improving inference speed while avoiding suboptimal manual assignment of teacher-student layer pairs. Meanwhile, Inter-frame Autoregressive Sparsification sequentially condenses multiple templates, avoiding additional runtime overhead while learning a temporally global optimal sparsification strategy. FARTrack demonstrates outstanding speed and competitive performance: it delivers an AO of 70.6\% on GOT-10k in real time. In addition, our fastest model achieves 343 FPS on the GPU and 121 FPS on the CPU. Source code is available at: https://github.com/MIV-XJTU/FARTrack.git
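
To make the idea of redundant-to-essential token optimization concrete, the following is a minimal, generic sketch of top-k token pruning. It is not FARTrack's learned sparsification strategy: the function name `keep_top_k_tokens`, the per-token `scores`, and the fixed budget `k` are all hypothetical and chosen only for illustration. The abstract does not specify how tokens are scored or selected.

```python
from typing import List, Sequence


def keep_top_k_tokens(
    tokens: Sequence[str], scores: Sequence[float], k: int
) -> List[str]:
    """Keep the k highest-scoring tokens, preserving their original order.

    Hypothetical helper: `scores` stands in for whatever importance
    measure a tracker might assign to template tokens.
    """
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    # Clamp the budget to the valid range [0, len(tokens)].
    k = max(0, min(k, len(tokens)))
    # Rank token indices from highest to lowest score.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    # Restore the original ordering among the survivors.
    kept = sorted(ranked[:k])
    return [tokens[i] for i in kept]


# Illustrative template tokens with made-up importance scores.
template = ["t0", "t1", "t2", "t3"]
importance = [0.1, 0.9, 0.3, 0.8]
condensed = keep_top_k_tokens(template, importance, k=2)
```

Pruning of this kind shortens the token sequence the model has to process, which is the general mechanism by which sparsification can reduce inference cost.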

Cite

Text

Wang et al. "FARTrack: Fast Autoregressive Visual Tracking with High Performance." International Conference on Learning Representations, 2026.

Markdown

[Wang et al. "FARTrack: Fast Autoregressive Visual Tracking with High Performance." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/wang2026iclr-fartrack/)

BibTeX

@inproceedings{wang2026iclr-fartrack,
  title     = {{FARTrack: Fast Autoregressive Visual Tracking with High Performance}},
  author    = {Wang, Guijie and Lin, Tong and Bai, Yifan and Cao, Anjia and Liang, Shiyi and Zhao, Wangbo and Wei, Xing},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/wang2026iclr-fartrack/}
}