Temporal Aggregation with CLIP-Level Attention for Video-Based Person Re-Identification

Abstract

Video-based person re-identification (Re-ID) methods can extract richer features from short video clips than image-based methods can from single images. Existing methods usually apply simple strategies, such as average or max pooling, to obtain tracklet-level features, but such strategies have been shown to aggregate information from all video frames poorly. In this paper, we propose a simple yet effective Temporal Aggregation with Clip-level Attention Network (TACAN) to solve the temporal aggregation problem in a hierarchical way. Specifically, a tracklet is first broken into a varying number of clips, and a two-stage temporal aggregation network then produces the tracklet-level feature representation. A novel min-max loss is introduced to learn both a clip-level attention extractor and a clip-level feature representer during training. At the testing stage, the resulting clip-level weights are used to compute a weighted average of the clip-level features, which yields a robust tracklet-level feature representation. Experimental results on four benchmark datasets, MARS, iLIDS-VID, PRID-2011, and DukeMTMC-VideoReID, show that TACAN achieves significant improvements over state-of-the-art approaches.
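The two-stage aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`split_into_clips`, `aggregate_tracklet`), the fixed clip length, the mean pooling inside each clip, and the softmax normalization of clip scores are all assumptions made for illustration. In TACAN the clip features and attention scores come from learned networks, which are replaced here by plain lists of numbers.

```python
import math
from typing import List, Sequence

Vector = List[float]


def split_into_clips(frames: Sequence[Vector], clip_len: int) -> List[List[Vector]]:
    """Break a tracklet (a sequence of per-frame features) into consecutive clips."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [list(frames[i:i + clip_len]) for i in range(0, len(frames), clip_len)]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax, so the clip weights sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_tracklet(frames: Sequence[Vector], clip_len: int,
                       clip_scores: Sequence[float]) -> Vector:
    """Two-stage aggregation: frames -> clip features -> tracklet feature."""
    clips = split_into_clips(frames, clip_len)
    if len(clip_scores) != len(clips):
        raise ValueError("need exactly one attention score per clip")
    # Stage 1 (assumption): mean-pool the frame features inside each clip.
    clip_features = [mean_vector(clip) for clip in clips]
    # Stage 2: attention-weighted average of the clip features.
    # Softmax normalization of the scores is an assumption of this sketch.
    weights = softmax(clip_scores)
    dim = len(clip_features[0])
    return [sum(w * f[d] for w, f in zip(weights, clip_features)) for d in range(dim)]


# Hypothetical toy tracklet: four 2-D frame features, two clips of two frames.
frames = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
print(aggregate_tracklet(frames, clip_len=2, clip_scores=[0.0, 0.0]))  # [1.0, 1.5]
```

With equal scores the result reduces to a plain average of the clip features. Raising one clip's score pulls the tracklet feature toward that clip, which is the effect the clip-level attention is meant to provide.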

Cite

Text

Li et al. "Temporal Aggregation with CLIP-Level Attention for Video-Based Person Re-Identification." Winter Conference on Applications of Computer Vision, 2020.

Markdown

[Li et al. "Temporal Aggregation with CLIP-Level Attention for Video-Based Person Re-Identification." Winter Conference on Applications of Computer Vision, 2020.](https://mlanthology.org/wacv/2020/li2020wacv-temporal/)

BibTeX

@inproceedings{li2020wacv-temporal,
  title     = {{Temporal Aggregation with CLIP-Level Attention for Video-Based Person Re-Identification}},
  author    = {Li, Mengliu and Xu, Han and Wang, Jinjun and Li, Wenpeng and Sun, Yongli},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2020},
  url       = {https://mlanthology.org/wacv/2020/li2020wacv-temporal/}
}