Learning a Key-Value Memory Co-Attention Matching Network for Person Re-Identification
Abstract
Person re-identification (Re-ID) is typically cast as a problem of semantic representation and alignment, which requires precisely discovering and modeling the inherent spatial structure of person images. Motivated by this observation, we propose a Key-Value Memory Matching Network (KVM-MN) model that consists of key-value memory representation and key-value co-attention matching. The proposed KVM-MN model builds an effective local-position-aware person representation that encodes spatial feature information in the form of multi-head key-value memory. Furthermore, the model uses multi-head co-attention to automatically learn a number of cross-person matching patterns, resulting in more robust and interpretable matching results. Finally, we build a setwise learning mechanism that implements a more generalized query-to-gallery-image-set learning procedure. Experimental results demonstrate the effectiveness of the proposed model against state-of-the-art methods.
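The abstract does not give the architecture in detail, so the following is only a minimal NumPy sketch of the general idea of multi-head key-value co-attention matching, not the authors' implementation. It assumes each image is already a grid of local features flattened to shape (positions, channels). The projection matrices `w_key` and `w_val`, the scaled dot-product affinity, and the cosine-based score are all illustrative assumptions, as are the function names.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def to_memory(feats, w_key, w_val, heads):
    """Hypothetical helper: project local features (positions, channels)
    into per-head keys and values of shape (heads, positions, d)."""
    n = feats.shape[0]
    keys = (feats @ w_key).reshape(n, heads, -1).transpose(1, 0, 2)
    vals = (feats @ w_val).reshape(n, heads, -1).transpose(1, 0, 2)
    return keys, vals

def co_attention_score(mem_q, mem_g):
    """Assumed matching rule: keys decide which positions correspond,
    values are compared after alignment, in both directions."""
    kq, vq = mem_q
    kg, vg = mem_g
    d = kq.shape[-1]
    # Per-head affinity between query and gallery positions: (heads, nq, ng).
    affinity = kq @ kg.transpose(0, 2, 1) / np.sqrt(d)
    # Gallery values aligned to each query position, and the reverse.
    g_for_q = softmax(affinity, axis=2) @ vg
    q_for_g = softmax(affinity, axis=1).transpose(0, 2, 1) @ vq

    def cos(a, b):
        num = (a * b).sum(-1)
        den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-8
        return num / den

    # Average cosine similarity over heads and positions, symmetrised.
    return float(0.5 * (cos(vq, g_for_q).mean() + cos(vg, q_for_g).mean()))

# Toy data: one query and a gallery set of three images (random features).
rng = np.random.default_rng(0)
channels, heads, d = 64, 4, 16
w_key = rng.normal(size=(channels, heads * d)) / np.sqrt(channels)
w_val = rng.normal(size=(channels, heads * d)) / np.sqrt(channels)

query = to_memory(rng.normal(size=(24, channels)), w_key, w_val, heads)
gallery = [to_memory(rng.normal(size=(24, channels)), w_key, w_val, heads)
           for _ in range(3)]
scores = [co_attention_score(query, g) for g in gallery]
```

Here the query is scored against every image in a small gallery set, which is the shape of input a query-to-gallery-set objective would consume. The sketch does not attempt to reproduce the setwise learning mechanism itself.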
Cite
Text
Zhang et al. "Learning a Key-Value Memory Co-Attention Matching Network for Person Re-Identification." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33019235
Markdown
[Zhang et al. "Learning a Key-Value Memory Co-Attention Matching Network for Person Re-Identification." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/zhang2019aaai-learning-d/) doi:10.1609/AAAI.V33I01.33019235
BibTeX
@inproceedings{zhang2019aaai-learning-d,
title = {{Learning a Key-Value Memory Co-Attention Matching Network for Person Re-Identification}},
author = {Zhang, Yaqing and Li, Xi and Zhang, Zhongfei},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2019},
pages = {9235-9242},
doi = {10.1609/AAAI.V33I01.33019235},
url = {https://mlanthology.org/aaai/2019/zhang2019aaai-learning-d/}
}