Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification

Abstract

Visible-Infrared Re-Identification (VI-ReID) is a challenging image retrieval task. The modality discrepancy easily causes large intra-class variations. Most existing methods either bridge the different modalities through modality invariance or generate an intermediate modality for better performance. In contrast, this paper proposes a novel framework, the Modality Synergy Complement Learning Network (MSCLNet) with Cascaded Aggregation. Its basic idea is to synergize the two modalities to construct diverse representations that carry identity-discriminative semantics and less noise. We then complement the synergistic representations by drawing on the advantages of the two modalities. Furthermore, we propose the Cascaded Aggregation strategy for fine-grained optimization of the feature distribution, which progressively aggregates feature embeddings at the subclass, intra-class, and inter-class levels. Extensive experiments on the SYSU-MM01 and RegDB datasets show that MSCLNet outperforms the state of the art by a large margin. On the large-scale SYSU-MM01 dataset, our model achieves 76.99% Rank-1 accuracy and 71.64% mAP. Our code will be available at https://github.com/bitreidgroup/VI-ReID-MSCLNet

Cite

Text

Zhang et al. "Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19781-9_27

Markdown

[Zhang et al. "Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/zhang2022eccv-modality/) doi:10.1007/978-3-031-19781-9_27

BibTeX

@inproceedings{zhang2022eccv-modality,
  title     = {{Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification}},
  author    = {Zhang, Yiyuan and Zhao, Sanyuan and Kang, Yuhao and Shen, Jianbing},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19781-9_27},
  url       = {https://mlanthology.org/eccv/2022/zhang2022eccv-modality/}
}