Hierarchical Feature Embedding for Visual Tracking

Abstract

Features extracted by existing tracking methods may contain both instance- and category-level information. However, because the two types of information are not explicitly modeled, one of them often dominates the feature embeddings uncontrollably, depending on the training data distribution. A more favorable approach is to produce features that emphasize both types of information in visual tracking. To achieve this, we propose a hierarchical feature embedding model that learns the instance and category information separately and embeds them progressively. We develop instance-aware and category-aware modules that collaborate at different semantic levels to produce discriminative and robust feature embeddings. The instance-aware module concentrates on the instance level, where an inter-video contrastive learning mechanism is adopted to facilitate inter-instance separability and intra-instance compactness. However, it is challenging to enforce intra-instance compactness with instance-level information alone, because the appearance of an instance changes frequently in visual tracking. To tackle this problem, the category-aware module is employed to summarize high-level category information, which remains robust despite instance-level appearance changes. As such, intra-instance compactness can be effectively improved by jointly leveraging the instance- and category-aware modules. Experimental results on various tracking benchmarks demonstrate that the proposed method performs favorably against state-of-the-art methods.
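
The abstract does not state which loss the instance-aware module uses. As a rough illustration only, the sketch below assumes a standard InfoNCE-style contrastive objective, which is one common way to encourage intra-instance compactness and inter-instance separability. The function names, the toy embeddings, and the temperature value are all hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for a single anchor embedding (assumed objective, not the paper's).

    `positive` is an embedding of the same instance (e.g. from another frame);
    `negatives` are embeddings of other instances (e.g. from other videos).
    """
    a = l2_normalize(anchor)
    # Cosine similarities scaled by the temperature; the positive comes first.
    logits = [dot(a, l2_normalize(positive)) / temperature]
    logits += [dot(a, l2_normalize(n)) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]

# Toy 2-D embeddings: the anchor and `same_instance` point in similar directions.
anchor = [1.0, 0.0]
same_instance = [0.9, 0.1]
other_instances = [[0.0, 1.0], [-1.0, 0.0]]

# Correct pairing: the positive is close to the anchor, so the loss is small.
loss_good = info_nce_loss(anchor, same_instance, other_instances)
# Wrong pairing: a different instance is treated as the positive, so the loss is large.
loss_bad = info_nce_loss(anchor, other_instances[0],
                         [same_instance, other_instances[1]])
```

Minimizing such a loss pulls embeddings of the same instance together and pushes embeddings of different instances apart. How the category-aware module then complements this signal is described only at a high level in the abstract and is not sketched here.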

Cite

Text

Pi et al. "Hierarchical Feature Embedding for Visual Tracking." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-20047-2_25

Markdown

[Pi et al. "Hierarchical Feature Embedding for Visual Tracking." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/pi2022eccv-hierarchical/) doi:10.1007/978-3-031-20047-2_25

BibTeX

@inproceedings{pi2022eccv-hierarchical,
  title     = {{Hierarchical Feature Embedding for Visual Tracking}},
  author    = {Pi, Zhixiong and Wan, Weitao and Sun, Chong and Gao, Changxin and Sang, Nong and Li, Chen},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-20047-2_25},
  url       = {https://mlanthology.org/eccv/2022/pi2022eccv-hierarchical/}
}