Stop Learning It All to Mitigate Visual Hallucination, Focus on the Hallucination Target.

Yoon, Dokyoon; Song, Youngsook; Park, Woomyoung

doi:10.1109/CVPR52734.2025.00397

CVPR 2025 pp. 4200-4208

Abstract

Multimodal Large Language Models (MLLMs) frequently suffer from hallucination issues, generating information about objects that are not present in input images during vision-language tasks. These hallucinations particularly undermine model reliability in practical applications requiring accurate object identification. To address this challenge, we propose TL-DPO, a preference learning approach that mitigates hallucinations by focusing on targeted areas where they occur. To implement this, we build a dataset containing hallucinated responses, correct responses, and target information (i.e., objects present in the images and the corresponding chunk positions in responses affected by hallucinations). By applying a preference learning method restricted to these specific targets, the model can filter out irrelevant signals and focus on correcting hallucinations. This allows the model to produce more factual responses by concentrating solely on relevant information. Experimental results demonstrate that TL-DPO effectively reduces hallucinations across multiple vision hallucination tasks, improving the reliability and performance of MLLMs without diminishing overall performance.
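The abstract does not state the TL-DPO objective itself. As a rough illustration only, the idea of "preference learning restricted to specific targets" can be sketched as a standard DPO loss in which the log-probability sums run only over tokens flagged by a binary target mask. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`masked_sum`, `targeted_dpo_loss`), the per-token mask layout, and the `beta` value.

```python
import math
from typing import Sequence


def masked_sum(logps: Sequence[float], mask: Sequence[int]) -> float:
    """Sum per-token log-probabilities over positions where mask is 1."""
    if len(logps) != len(mask):
        raise ValueError("logps and mask must have the same length")
    return sum(lp for lp, m in zip(logps, mask) if m)


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def targeted_dpo_loss(
    policy_chosen: Sequence[float],
    ref_chosen: Sequence[float],
    chosen_mask: Sequence[int],
    policy_rejected: Sequence[float],
    ref_rejected: Sequence[float],
    rejected_mask: Sequence[int],
    beta: float = 0.1,  # hypothetical value, not from the paper
) -> float:
    """DPO loss computed only over masked (hallucination-target) tokens.

    Hypothetical sketch: tokens outside the mask contribute nothing,
    so only the targeted chunks drive the preference signal.
    """
    # Policy-vs-reference log-ratio on the correct (chosen) response.
    chosen_ratio = masked_sum(policy_chosen, chosen_mask) - masked_sum(
        ref_chosen, chosen_mask
    )
    # Same log-ratio on the hallucinated (rejected) response.
    rejected_ratio = masked_sum(policy_rejected, rejected_mask) - masked_sum(
        ref_rejected, rejected_mask
    )
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) is equal to softplus(-margin).
    return _softplus(-margin)


# Toy usage: three tokens per response, only the middle token is a target.
mask = [0, 1, 0]
loss = targeted_dpo_loss(
    policy_chosen=[-0.5, -0.2, -1.0],
    ref_chosen=[-0.5, -0.9, -1.0],
    chosen_mask=mask,
    policy_rejected=[-0.5, -2.0, -1.0],
    ref_rejected=[-0.5, -1.0, -1.0],
    rejected_mask=mask,
)
```

In this sketch, changing the log-probability of a token outside the mask leaves the loss unchanged, which is one way to read the abstract's claim that the model can "filter out irrelevant signals".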

Cite

Text

Yoon et al. "Stop Learning It All to Mitigate Visual Hallucination, Focus on the Hallucination Target." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00397

Markdown

[Yoon et al. "Stop Learning It All to Mitigate Visual Hallucination, Focus on the Hallucination Target." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/yoon2025cvpr-stop/) doi:10.1109/CVPR52734.2025.00397

BibTeX

@inproceedings{yoon2025cvpr-stop,
  title     = {{Stop Learning It All to Mitigate Visual Hallucination, Focus on the Hallucination Target.}},
  author    = {Yoon, Dokyoon and Song, Youngsook and Park, Woomyoung},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {4200--4208},
  doi       = {10.1109/CVPR52734.2025.00397},
  url       = {https://mlanthology.org/cvpr/2025/yoon2025cvpr-stop/}
}