Learning Modal-Invariant and Temporal-Memory for Video-Based Visible-Infrared Person Re-Identification

Abstract

Thanks to cross-modal retrieval techniques, visible-infrared (RGB-IR) person re-identification (Re-ID) is achieved by projecting the two modalities into a common space, enabling person Re-ID in 24-hour surveillance systems. However, with respect to "probe-to-gallery" matching, almost all existing RGB-IR cross-modal person Re-ID methods focus on image-to-image matching, while video-to-video matching, which contains much richer spatial and temporal information, remains under-explored. In this paper, we primarily study video-based cross-modal person Re-ID. To this end, a video-based RGB-IR dataset is constructed, comprising 927 valid identities with 463,259 frames and 21,863 tracklets captured by 12 RGB/IR cameras. Based on this dataset, we show that performance improves as the number of frames in a tracklet increases, demonstrating the significance of video-to-video matching in RGB-IR person Re-ID. Additionally, a novel method is proposed that not only projects the two modalities into a modal-invariant subspace, but also extracts temporal memory for motion invariance. Thanks to these two strategies, much better results are achieved on video-based cross-modal person Re-ID. The code is released at: https://github.com/VCM-project233/MITML.

Cite

Text

Lin et al. "Learning Modal-Invariant and Temporal-Memory for Video-Based Visible-Infrared Person Re-Identification." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.02030

Markdown

[Lin et al. "Learning Modal-Invariant and Temporal-Memory for Video-Based Visible-Infrared Person Re-Identification." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/lin2022cvpr-learning-a/) doi:10.1109/CVPR52688.2022.02030

BibTeX

@inproceedings{lin2022cvpr-learning-a,
  title     = {{Learning Modal-Invariant and Temporal-Memory for Video-Based Visible-Infrared Person Re-Identification}},
  author    = {Lin, Xinyu and Li, Jinxing and Ma, Zeyu and Li, Huafeng and Li, Shuang and Xu, Kaixiong and Lu, Guangming and Zhang, David},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {20973-20982},
  doi       = {10.1109/CVPR52688.2022.02030},
  url       = {https://mlanthology.org/cvpr/2022/lin2022cvpr-learning-a/}
}