ChatReID: Open-Ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models

Ke Niu, Haiyang Yu, Mengyang Zhao, Teng Fu, Siyang Yi, Wei Lu, Bin Li, Xuelin Qian, Xiangyang Xue

ICCV 2025 pp. 24245-24254

Abstract

Person re-identification (Re-ID) is a crucial task in computer vision, aiming to recognize individuals across non-overlapping camera views. While recent advanced vision-language models (VLMs) excel in logical reasoning and multi-task generalization, their application to Re-ID tasks remains limited. These models either struggle to perform accurate matching based on identity-relevant features or merely serve image-dominated branches as auxiliary semantics. In this paper, we propose ChatReID, a novel framework that shifts the focus towards a text-side-dominated retrieval paradigm, enabling flexible and interactive re-identification. To integrate the reasoning abilities of language models into Re-ID pipelines, we first present a large-scale instruction dataset containing more than 8 million prompts to support model fine-tuning. Next, we introduce a hierarchical progressive tuning strategy, which endows the model with Re-ID ability through three stages of tuning: from person attribute understanding, to fine-grained image retrieval, to multi-modal task reasoning. Extensive experiments across ten popular benchmarks demonstrate that ChatReID outperforms existing methods, achieving state-of-the-art performance in all Re-ID tasks. Further experiments demonstrate that ChatReID can not only recognize fine-grained details but also integrate them into a coherent reasoning process.

Cite

Text

Niu et al. "ChatReID: Open-Ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models." International Conference on Computer Vision, 2025.

Markdown

[Niu et al. "ChatReID: Open-Ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/niu2025iccv-chatreid/)

BibTeX

@inproceedings{niu2025iccv-chatreid,
  title     = {{ChatReID: Open-Ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models}},
  author    = {Niu, Ke and Yu, Haiyang and Zhao, Mengyang and Fu, Teng and Yi, Siyang and Lu, Wei and Li, Bin and Qian, Xuelin and Xue, Xiangyang},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {24245--24254},
  url       = {https://mlanthology.org/iccv/2025/niu2025iccv-chatreid/}
}