Global-Local Temporal Representations for Video Person Re-Identification

Li, Jianing; Wang, Jingdong; Tian, Qi; Gao, Wen; Zhang, Shiliang

doi:10.1109/ICCV.2019.00406

ICCV 2019


Abstract

This paper proposes the Global-Local Temporal Representation (GLTR) to exploit multi-scale temporal cues in video sequences for video person Re-Identification (ReID). GLTR is constructed by first modeling the short-term temporal cues among adjacent frames, then capturing the long-term relations among non-consecutive frames. Specifically, the short-term temporal cues are modeled by parallel dilated convolutions with different temporal dilation rates to represent the motion and appearance of pedestrians. The long-term relations are captured by a temporal self-attention model to alleviate occlusions and noise in video sequences. The short-term and long-term temporal cues are aggregated into the final GLTR by a simple single-stream CNN. GLTR shows substantial superiority to existing features learned with body part cues or metric learning on four widely used video ReID datasets. For instance, it achieves a Rank-1 accuracy of 87.02% on the MARS dataset without re-ranking, better than the current state of the art.
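The two temporal stages named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the fixed scalar kernel, the dilation rates (1, 2, 4), the identity query/key/value projections, and the final average pooling are all assumptions chosen for brevity, whereas the paper learns these components inside a CNN.

```python
import math

def dilated_temporal_conv(frames, kernel, dilation):
    """Temporal convolution over per-frame feature vectors, zero-padded.

    frames: list of T feature vectors (lists of floats).
    kernel: odd-length list of scalar weights (hypothetical; a real model
            would learn weight matrices instead of shared scalars).
    """
    T, d = len(frames), len(frames[0])
    half = len(kernel) // 2
    out = []
    for t in range(T):
        acc = [0.0] * d
        for k, w in enumerate(kernel):
            src = t + (k - half) * dilation  # neighbour at the dilated offset
            if 0 <= src < T:                 # out-of-range frames count as zero
                for j in range(d):
                    acc[j] += w * frames[src][j]
        out.append(acc)
    return out

def short_term_cues(frames, kernel, dilations=(1, 2, 4)):
    """Run parallel dilated branches and concatenate them per frame.

    The dilation rates are an assumption made for this sketch.
    """
    branches = [dilated_temporal_conv(frames, kernel, r) for r in dilations]
    return [sum((b[t] for b in branches), []) for t in range(len(frames))]

def temporal_self_attention(frames):
    """Scaled dot-product self-attention across time.

    Queries, keys, and values are the inputs themselves (identity
    projections, an assumption made to keep the sketch short).
    """
    d = len(frames[0])
    out = []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in frames]
        m = max(scores)                      # subtract max for a stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, frames)) for j in range(d)])
    return out

def gltr_sketch(frames, kernel=(0.25, 0.5, 0.25)):
    """Short-term cues, then long-term attention, then temporal average pooling."""
    local = short_term_cues(frames, kernel)
    attended = temporal_self_attention(local)
    T = len(attended)
    return [sum(f[j] for f in attended) / T for j in range(len(attended[0]))]

if __name__ == "__main__":
    # Four frames with 2-dimensional toy features.
    clip = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    print(gltr_sketch(clip))
```

With three dilation branches, a clip of 2-dimensional frame features yields a 6-dimensional sequence-level vector. Larger dilation rates let a branch relate frames that are further apart without enlarging the kernel, while the attention step lets every frame draw on all others, which is the intuition behind down-weighting occluded or noisy frames.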


Cite

Text

Li et al. "Global-Local Temporal Representations for Video Person Re-Identification." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019. doi:10.1109/ICCV.2019.00406

Markdown

[Li et al. "Global-Local Temporal Representations for Video Person Re-Identification." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019.](https://mlanthology.org/iccv/2019/li2019iccv-globallocal/) doi:10.1109/ICCV.2019.00406

BibTeX

@inproceedings{li2019iccv-globallocal,
  title     = {{Global-Local Temporal Representations for Video Person Re-Identification}},
  author    = {Li, Jianing and Wang, Jingdong and Tian, Qi and Gao, Wen and Zhang, Shiliang},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year      = {2019},
  doi       = {10.1109/ICCV.2019.00406},
  url       = {https://mlanthology.org/iccv/2019/li2019iccv-globallocal/}
}