Text-Based Person Search via Attribute-Aided Matching
Abstract
Text-based person search aims to retrieve the pedestrian images that best match a given text query. Existing methods utilize class-id information to obtain discriminative and identity-preserving features. However, whether explicitly preserving the semantics of the data is beneficial remains under-explored. In the proposed work, we aim to create semantics-preserving embeddings through an additional task of attribute prediction. Since attribute annotations are typically unavailable in text-based person search, we first mine them from the text corpus. These attributes are then used as a means to bridge the modality gap between the image-text inputs, as well as to improve the representation learning. In summary, we propose an approach for text-based person search that learns an attribute-driven space along with a class-information-driven space, and utilizes both for obtaining the retrieval results. Our experiments on the benchmark dataset CUHK-PEDES show that learning the attribute space not only helps improve performance, giving us a state-of-the-art Rank-1 accuracy of 56.68%, but also yields human-interpretable features.
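The abstract describes fusing retrieval scores from two embedding spaces (attribute-driven and class-information-driven). A minimal sketch of such score-level fusion, assuming cosine similarity in each space and a hypothetical fusion weight `w` (the paper's actual fusion scheme and hyperparameters are not specified here):

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Unit-normalize vectors so dot products equal cosine similarity
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def rank_gallery(txt_cls, txt_attr, img_cls, img_attr, w=0.5):
    """Rank gallery images for one text query by fusing cosine
    similarities from the class space and the attribute space.
    `w` is an illustrative fusion weight, not from the paper."""
    s_cls = l2_normalize(img_cls) @ l2_normalize(txt_cls)    # class-space scores
    s_attr = l2_normalize(img_attr) @ l2_normalize(txt_attr) # attribute-space scores
    scores = w * s_cls + (1.0 - w) * s_attr
    return np.argsort(-scores)  # gallery indices, best match first

# Toy example: gallery image 0 aligns with the query in both spaces
txt_cls = np.array([1.0, 0.0])
txt_attr = np.array([1.0, 0.0])
img_cls = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
img_attr = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
ranking = rank_gallery(txt_cls, txt_attr, img_cls, img_attr)
```

At retrieval time, Rank-1 accuracy then simply checks whether `ranking[0]` belongs to the queried identity.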
Cite
Text
Aggarwal et al. "Text-Based Person Search via Attribute-Aided Matching." Winter Conference on Applications of Computer Vision, 2020.
Markdown
[Aggarwal et al. "Text-Based Person Search via Attribute-Aided Matching." Winter Conference on Applications of Computer Vision, 2020.](https://mlanthology.org/wacv/2020/aggarwal2020wacv-textbased/)
BibTeX
@inproceedings{aggarwal2020wacv-textbased,
title = {{Text-Based Person Search via Attribute-Aided Matching}},
author = {Aggarwal, Surbhi and Radhakrishnan, Venkatesh Babu and Chakraborty, Anirban},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2020},
url = {https://mlanthology.org/wacv/2020/aggarwal2020wacv-textbased/}
}