SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models
Abstract
Large Language Models (LLMs) are prone to hallucinations, which pose significant risks in their applications. Most existing hallucination detection methods rely on internal probabilities or external knowledge, and they are limited to identifying hallucinations at the sentence or passage level. In this paper, we introduce the first token-level, zero-resource hallucination detection framework, leveraging a novel approach inspired by the Mad Libs game. This method assesses the reliability of the input text by evaluating the consistency of information before and after the game. Building on this framework, we also propose an innovative automated hallucination generation technique and introduce a high-quality hallucination dataset, HalluWiki. Extensive experiments demonstrate that our approach achieves over 90% detection accuracy across different levels, establishing a new frontier in hallucination detection for LLMs.
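The abstract describes the Mad Libs-style consistency check only at a high level. As a rough illustration (not the authors' implementation), the sketch below blanks out one word at a time, lets a masked language model refill it via the Hugging Face `fill-mask` pipeline, and flags words whose original value the model fails to recover as potentially hallucinated. The model choice (`bert-base-uncased`), the word-level masking heuristic, and the probability threshold are all assumptions made for this example.

```python
# Illustrative sketch only -- not the SHIFT implementation.
# Idea: "play Mad Libs" by blanking one word at a time and checking whether
# the model's refill is consistent with the original word; poorly recovered
# words are flagged as possible token-level hallucinations.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
MASK = fill.tokenizer.mask_token  # "[MASK]" for BERT


def flag_tokens(sentence: str, threshold: float = 0.05, top_k: int = 10):
    """Return (word, recovered_probability, flagged) for each word."""
    words = sentence.split()
    results = []
    for i, word in enumerate(words):
        # Blank out one word and ask the model to refill it (the "game").
        blanked = " ".join(words[:i] + [MASK] + words[i + 1:])
        predictions = fill(blanked, top_k=top_k)
        # Probability mass assigned to the original word when refilling.
        # (Simplification: words split into several word pieces can never
        # match exactly and would always be flagged.)
        prob = sum(
            p["score"]
            for p in predictions
            if p["token_str"].strip().lower() == word.strip(".,").lower()
        )
        results.append((word, prob, prob < threshold))
    return results


for word, prob, flagged in flag_tokens("The Eiffel Tower is located in Berlin."):
    print(f"{word:>10s}  p={prob:.3f}  {'<-- suspicious' if flagged else ''}")
```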
Cite
Text
Wang et al. "SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models." International Conference on Computer Vision, 2025.Markdown
[Wang et al. "SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/wang2025iccv-shift/)BibTeX
@inproceedings{wang2025iccv-shift,
title = {{SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models}},
author = {Wang, Sudong and Zhang, Yunjian and Zhu, Yao and Liu, Enci and Li, Jianing and Liu, Yanwei and Ji, Xiangyang},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {3639--3649},
url = {https://mlanthology.org/iccv/2025/wang2025iccv-shift/}
}