Generalized Relation Modeling for Transformer Tracking
Abstract
Compared with previous two-stream trackers, the recent one-stream tracking pipeline, which allows earlier interaction between the template and search region, has achieved a remarkable performance gain. However, existing one-stream trackers always let the template interact with all parts inside the search region throughout all the encoder layers. This could potentially lead to target-background confusion when the extracted feature representations are not sufficiently discriminative. To alleviate this issue, we propose a generalized relation modeling method based on adaptive token division. The proposed method is a generalized formulation of attention-based relation modeling for Transformer tracking, which inherits the merits of both previous two-stream and one-stream pipelines whilst enabling more flexible relation modeling by selecting appropriate search tokens to interact with template tokens. An attention masking strategy and the Gumbel-Softmax technique are introduced to facilitate the parallel computation and end-to-end learning of the token division module. Extensive experiments show that our method is superior to the two-stream and one-stream pipelines and achieves state-of-the-art performance on six challenging benchmarks with a real-time running speed.
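As a rough illustration of the two ingredients named in the abstract, the sketch below samples a per-token keep/drop decision with Gumbel-Softmax and turns those decisions into an attention mask over the concatenated template and search tokens. It is a forward-only, pure-Python toy and not the authors' implementation. The function names, the two-logit layout per search token, and the masking rule (blocking template-search attention in both directions for unselected search tokens) are all assumptions made for illustration.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed one-hot sample from categorical logits."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
        # u is clamped away from 0 and 1 to keep both logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def divide_tokens(search_logits, tau=1.0, rng=random):
    """Return one 0/1 decision per search token (1 = interact with template).

    Assumed layout: each entry is [logit_skip, logit_interact].
    """
    decisions = []
    for logits in search_logits:
        probs = gumbel_softmax(logits, tau, rng)
        decisions.append(1 if probs[1] >= probs[0] else 0)
    return decisions


def build_attention_mask(num_template, decisions):
    """Boolean mask over [template; search] tokens; True = attention allowed.

    Assumed rule: an unselected search token neither attends to nor is
    attended by template tokens. All other pairs stay connected.
    """
    n = num_template + len(decisions)
    mask = [[True] * n for _ in range(n)]
    for j, keep in enumerate(decisions):
        if keep:
            continue
        s = num_template + j
        for t in range(num_template):
            mask[t][s] = False
            mask[s][t] = False
    return mask


# Usage: 2 template tokens, 3 search tokens with hypothetical logits.
rng = random.Random(0)
decisions = divide_tokens([[0.0, 5.0], [5.0, 0.0], [0.0, 0.0]], rng=rng)
mask = build_attention_mask(2, decisions)
```

Because the division is expressed as a mask rather than by physically regrouping tokens, every token can still be processed in one batched attention call, which is the kind of parallel computation the abstract refers to. Gradient handling for end-to-end training is omitted here.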
Cite
Text
Gao et al. "Generalized Relation Modeling for Transformer Tracking." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01792
Markdown
[Gao et al. "Generalized Relation Modeling for Transformer Tracking." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/gao2023cvpr-generalized/) doi:10.1109/CVPR52729.2023.01792
BibTeX
@inproceedings{gao2023cvpr-generalized,
title = {{Generalized Relation Modeling for Transformer Tracking}},
author = {Gao, Shenyuan and Zhou, Chunluan and Zhang, Jun},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2023},
pages = {18686-18695},
doi = {10.1109/CVPR52729.2023.01792},
url = {https://mlanthology.org/cvpr/2023/gao2023cvpr-generalized/}
}