Categorical Attention: Fine-Grained Language-Guided Noise Filtering Network for Occluded Person Re-Identification

Abstract

Person Re-Identification (ReID) aims to match individuals across different camera views, but occlusions in real-world scenarios, such as vehicles or crowds, hinder feature extraction and matching. Current occluded ReID methods typically rely on visual augmentation techniques to mitigate occlusion-induced noise. However, relying solely on visual data fails to filter out occlusion noise effectively. In this paper, we introduce the Fine-grained Language-guided Noise Filtering Network (FLaN-Net) for occluded ReID. FLaN-Net employs a categorical attention mechanism to generate adaptive tokens that capture three distinct types of visual information: comprehensive descriptions of individuals, detailed visible attributes, and characteristics of occluding objects. A cross-attention mechanism then aligns these prompts with the image, guiding the model to focus on relevant regions. To generate robust and discriminative features for occluded pedestrians, we further introduce a dynamic weighting fusion module that integrates visual, textual, and cross-attention features based on their reliability. Experimental results demonstrate that FLaN-Net outperforms existing methods on occluded ReID benchmarks, offering a robust solution for challenging real-world conditions.
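
The abstract names two mechanisms without giving their form: cross-attention between language tokens and the image, and reliability-based fusion of three feature types. The following is a minimal sketch of the general idea only, not the authors' implementation. It assumes single-head scaled dot-product attention with one text token as the query over image patch features, and it assumes the fusion is a convex combination whose weights come from a softmax over per-branch reliability scores. The function names `cross_attend` and `fuse`, and the way the reliability scores are obtained, are hypothetical; in a real model the scores would presumably be learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(text_token, patch_feats):
    """Single-head cross-attention (assumed form): the text token is the
    query, the image patch features serve as both keys and values."""
    scale = math.sqrt(len(text_token))
    weights = softmax([dot(text_token, p) / scale for p in patch_feats])
    dim = len(patch_feats[0])
    attended = [sum(w * p[i] for w, p in zip(weights, patch_feats))
                for i in range(dim)]
    return attended, weights

def fuse(features, reliability_scores):
    """Hypothetical fusion: softmax the reliability scores and take the
    weighted sum of the feature vectors (visual, textual, cross-attention)."""
    weights = softmax(reliability_scores)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(dim)]

# Toy usage: two image patches, one text token, three feature branches.
patches = [[1.0, 0.0], [0.0, 1.0]]
text = [1.0, 0.0]
cross_feat, attn = cross_attend(text, patches)
visual_feat = [0.5, 0.5]
fused = fuse([visual_feat, text, cross_feat], [0.2, 0.1, 0.7])
```

In this toy setup the patch that best matches the text token receives the larger attention weight, and a branch with a higher reliability score contributes more to the fused feature. A branch dominated by occlusion noise would be down-weighted only if its reliability score is low, which is the part a trained model would have to supply.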

Cite

Text

Chen et al. "Categorical Attention: Fine-Grained Language-Guided Noise Filtering Network for Occluded Person Re-Identification." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/90

Markdown

[Chen et al. "Categorical Attention: Fine-Grained Language-Guided Noise Filtering Network for Occluded Person Re-Identification." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/chen2025ijcai-categorical/) doi:10.24963/IJCAI.2025/90

BibTeX

@inproceedings{chen2025ijcai-categorical,
  title     = {{Categorical Attention: Fine-Grained Language-Guided Noise Filtering Network for Occluded Person Re-Identification}},
  author    = {Chen, Minghui and Wu, Dayan and Yang, Chenxu and Su, Qinghang and Lin, Zheng},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {801-809},
  doi       = {10.24963/IJCAI.2025/90},
  url       = {https://mlanthology.org/ijcai/2025/chen2025ijcai-categorical/}
}