Screening, Rectifying, and Re-Screening: A Unified Framework for Tuning Vision-Language Models with Noisy Labels

Chaowei Fang, Hangfei Ma, Zhihao Li, De Cheng, Yue Zhang, Guanbin Li

IJCAI 2025 pp. 5101-5109

doi:10.24963/IJCAI.2025/568

Abstract

Pre-trained vision-language models have shown remarkable potential for downstream tasks. However, fine-tuning them under noisy labels remains an open problem because of challenges such as self-confirmation bias and the limitations of conventional small-loss criteria. In this paper, we propose a unified framework to address these issues, consisting of three key steps: Screening, Rectifying, and Re-Screening. First, a dual-level semantic matching mechanism is introduced to categorize samples as clean, ambiguous, or noisy by leveraging both macro-level and micro-level textual prompts. Second, we design tailored pseudo-labeling strategies to rectify noisy and ambiguous labels, enabling their effective incorporation into the training process. Finally, a re-screening step, which uses cross-validation with an auxiliary vision-language model, mitigates self-confirmation bias and enhances the robustness of the framework. Extensive experiments across ten datasets demonstrate that the proposed method significantly outperforms existing approaches for tuning vision-language pre-trained models with noisy labels.
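To make the screening step more concrete, the following is a minimal sketch of a three-way split into clean, ambiguous, and noisy samples. It is not the authors' implementation: the abstract does not state the decision rule, so the agreement rule used here (clean if both prompt levels agree with the given label, noisy if neither does, ambiguous otherwise) is an assumption, and the names `screen`, `macro_scores`, and `micro_scores` are hypothetical. The scores stand in for image-text similarities computed with macro-level and micro-level prompts.

```python
from typing import List, Sequence


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score (first index on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def screen(
    macro_scores: Sequence[Sequence[float]],
    micro_scores: Sequence[Sequence[float]],
    labels: Sequence[int],
) -> List[str]:
    """Split samples into 'clean', 'ambiguous', or 'noisy'.

    ASSUMED rule (not taken from the paper): count how many of the two
    prompt levels predict the given label.
      2 agreements -> clean, 1 -> ambiguous, 0 -> noisy.
    """
    names = ("noisy", "ambiguous", "clean")
    groups = []
    for macro, micro, label in zip(macro_scores, micro_scores, labels):
        agreements = int(argmax(macro) == label) + int(argmax(micro) == label)
        groups.append(names[agreements])
    return groups


# Toy per-class scores for three samples and two classes (made-up values).
macro = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
micro = [[0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
given_labels = [0, 0, 0]

groups = screen(macro, micro, given_labels)
print(groups)
```

Under this assumed rule, the ambiguous and noisy groups would then be passed to the rectifying step for pseudo-labeling, and the resulting split would be checked again against an auxiliary model during re-screening; neither step is sketched here because the abstract gives no further detail.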

Cite

Text

Fang et al. "Screening, Rectifying, and Re-Screening: A Unified Framework for Tuning Vision-Language Models with Noisy Labels." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/568

Markdown

[Fang et al. "Screening, Rectifying, and Re-Screening: A Unified Framework for Tuning Vision-Language Models with Noisy Labels." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/fang2025ijcai-screening/) doi:10.24963/IJCAI.2025/568

BibTeX

@inproceedings{fang2025ijcai-screening,
  title     = {{Screening, Rectifying, and Re-Screening: A Unified Framework for Tuning Vision-Language Models with Noisy Labels}},
  author    = {Fang, Chaowei and Ma, Hangfei and Li, Zhihao and Cheng, De and Zhang, Yue and Li, Guanbin},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {5101--5109},
  doi       = {10.24963/IJCAI.2025/568},
  url       = {https://mlanthology.org/ijcai/2025/fang2025ijcai-screening/}
}