An Improved Deep Learning Architecture for Person Re-Identification
Abstract
In this work we propose a method for simultaneously learning features and a corresponding similarity metric for person re-identification. We present a deep convolutional architecture with layers specially designed to address the problem of re-identification. Given a pair of images as input, our network outputs a similarity value indicating whether the two input images depict the same person. Novel elements of our architecture include a layer that computes cross-input neighborhood differences, which capture local relationships among mid-level features that were computed separately from the two input images. A high-level summary of the outputs of this layer is computed by a layer of patch summary features, which are then spatially integrated in subsequent layers. Our method significantly outperforms the state of the art on both a large data set (CUHK03) and a medium-sized data set (CUHK01), and it is resistant to overfitting. We also demonstrate that by initially training on an unrelated large data set before fine-tuning on a small target data set, our network can achieve results comparable to the state of the art even on the small data set (VIPeR).
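To make the cross-input neighborhood difference idea concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes both images have already been reduced to mid-level feature maps of identical shape `(C, H, W)`, and that each feature value in the first map is compared against a `k × k` neighborhood around the same location in the second map. The function name, the default `k = 5`, and the zero padding at the borders are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def cross_input_neighborhood_differences(f, g, k=5):
    """Compare each value of f with a k x k neighborhood of g.

    f, g : arrays of shape (C, H, W), feature maps computed separately
           from the two input images (assumed layout).
    k    : odd neighborhood size (assumed; not given in the abstract).

    Returns an array of shape (C, H, W, k, k) where
    out[c, y, x, dy, dx] = f[c, y, x] - g[c, y + dy - k//2, x + dx - k//2],
    with g treated as zero outside its borders.
    """
    if f.shape != g.shape:
        raise ValueError("f and g must have the same shape")
    if k % 2 != 1:
        raise ValueError("k must be odd")

    pad = k // 2
    channels, height, width = f.shape
    # Zero-pad only the two spatial axes of g.
    g_padded = np.pad(g, ((0, 0), (pad, pad), (pad, pad)), mode="constant")

    out = np.empty((channels, height, width, k, k), dtype=np.result_type(f, g))
    for dy in range(k):
        for dx in range(k):
            # Shifted view of g aligned with f for this neighborhood offset.
            shifted = g_padded[:, dy:dy + height, dx:dx + width]
            out[:, :, :, dy, dx] = f - shifted
    return out
```

Calling the function a second time with the arguments swapped gives the differences in the other direction. In an architecture like the one described, a later layer would then summarize each `k × k` block into patch summary features.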
Cite
Text
Ahmed et al. "An Improved Deep Learning Architecture for Person Re-Identification." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7299016
Markdown
[Ahmed et al. "An Improved Deep Learning Architecture for Person Re-Identification." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/ahmed2015cvpr-improved/) doi:10.1109/CVPR.2015.7299016
BibTeX
@inproceedings{ahmed2015cvpr-improved,
title = {{An Improved Deep Learning Architecture for Person Re-Identification}},
author = {Ahmed, Ejaz and Jones, Michael and Marks, Tim K.},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2015},
doi = {10.1109/CVPR.2015.7299016},
url = {https://mlanthology.org/cvpr/2015/ahmed2015cvpr-improved/}
}