A Twofold Siamese Network for Real-Time Object Tracking

Abstract

Observing that Semantic features learned in an image classification task and Appearance features learned in a similarity matching task complement each other, we build a twofold Siamese network, named SA-Siam, for real-time object tracking. SA-Siam is composed of a semantic branch and an appearance branch, each of which is a similarity-learning Siamese network. An important design choice in SA-Siam is to train the two branches separately, which preserves the heterogeneity of the two types of features. In addition, we propose a channel attention mechanism for the semantic branch, in which channel-wise weights are computed according to the channel activations around the target position. While the architecture inherited from SiamFC allows our tracker to operate beyond real-time, the twofold design and the attention mechanism significantly improve tracking performance. The proposed SA-Siam outperforms all other real-time trackers by a large margin on the OTB-2013/50/100 benchmarks.
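The twofold design described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`cross_correlate`, `channel_attention`, `sa_response`), the fusion weight `lam`, the window `radius`, and the sigmoid-of-mean attention are all assumptions chosen for illustration. In particular, the attention here is a hand-written stand-in for whatever learned module the paper uses, and the weighted sum of the two response maps is one plausible way to fuse branches that were trained separately.

```python
import numpy as np


def cross_correlate(template, search):
    """Slide a (C, h, w) template over a (C, H, W) search map.

    Returns an (H - h + 1, W - w + 1) response map, as in a
    SiamFC-style similarity branch.
    """
    _, h, w = template.shape
    _, H, W = search.shape
    response = np.empty((H - h + 1, W - w + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            response[i, j] = np.sum(template * search[:, i:i + h, j:j + w])
    return response


def channel_attention(template, radius=1):
    """Illustrative stand-in for channel attention (an assumption).

    Produces one weight in (0, 1) per channel from the mean activation
    in a small window around the template centre, which is taken here
    to be the target position.
    """
    _, h, w = template.shape
    cy, cx = h // 2, w // 2
    window = template[:,
                      max(cy - radius, 0):cy + radius + 1,
                      max(cx - radius, 0):cx + radius + 1]
    pooled = window.mean(axis=(1, 2))
    return 1.0 / (1.0 + np.exp(-pooled))


def sa_response(app_z, app_x, sem_z, sem_x, lam=0.5):
    """Fuse appearance and semantic response maps with a weighted sum.

    `lam` is a hypothetical fusion weight. Both branches must yield
    response maps of the same spatial size.
    """
    r_app = cross_correlate(app_z, app_x)
    weights = channel_attention(sem_z)[:, None, None]
    r_sem = cross_correlate(weights * sem_z, sem_x)
    return lam * r_app + (1.0 - lam) * r_sem
```

In this sketch the tracked position would be read off as the peak of the fused map, for example with `np.unravel_index(np.argmax(resp), resp.shape)`. Keeping the two branches as independent inputs to `sa_response` mirrors the abstract's point that they are trained separately and only combined when matching.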

Cite

Text

He et al. "A Twofold Siamese Network for Real-Time Object Tracking." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00508

Markdown

[He et al. "A Twofold Siamese Network for Real-Time Object Tracking." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/he2018cvpr-twofold/) doi:10.1109/CVPR.2018.00508

BibTeX

@inproceedings{he2018cvpr-twofold,
  title     = {{A Twofold Siamese Network for Real-Time Object Tracking}},
  author    = {He, Anfeng and Luo, Chong and Tian, Xinmei and Zeng, Wenjun},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00508},
  url       = {https://mlanthology.org/cvpr/2018/he2018cvpr-twofold/}
}