HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs
Abstract
The reliability of Large Language Models (LLMs) in high-stakes domains such as healthcare, law, and scientific discovery is often compromised by hallucinations. These failures typically stem from two sources: *data-driven hallucinations* and *reasoning-driven hallucinations*. However, existing detection methods usually address only one source and rely on task-specific heuristics, limiting their generalization to complex scenarios. To overcome these limitations, we introduce the *Hallucination Risk Bound*, a unified theoretical framework that formally decomposes hallucination risk into data-driven and reasoning-driven components, linked respectively to training-time mismatches and inference-time instabilities. This provides a principled foundation for analyzing how hallucinations emerge and evolve. Building on this foundation, we introduce **HalluGuard**, a score based on the Neural Tangent Kernel (NTK) that leverages the NTK's induced geometry and captured representations to jointly identify data-driven and reasoning-driven hallucinations. We evaluate **HalluGuard** on 10 diverse benchmarks against 11 competitive baselines and across 9 popular LLM backbones, where it consistently achieves state-of-the-art performance in detecting diverse forms of LLM hallucinations. We open-source **HalluGuard** at https://github.com/Susan571/HalluGuard-ICLR2026.
Cite
Text
Zeng et al. "HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs." International Conference on Learning Representations, 2026.
Markdown
[Zeng et al. "HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zeng2026iclr-halluguard/)
BibTeX
@inproceedings{zeng2026iclr-halluguard,
title = {{HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs}},
author = {Zeng, Xinyue and Lin, Junhong and Yan, Yujun and Guo, Feng and Shi, Liang and Wu, Jun and Zhou, Dawei},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/zeng2026iclr-halluguard/}
}