HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs

Zeng, Xinyue; Lin, Junhong; Yan, Yujun; Guo, Feng; Shi, Liang; Wu, Jun; Zhou, Dawei

ICLR 2026

Abstract

The reliability of Large Language Models (LLMs) in high-stakes domains such as healthcare, law, and scientific discovery is often compromised by hallucinations. These failures typically stem from two sources: *data-driven hallucinations* and *reasoning-driven hallucinations*. However, existing detection methods usually address only one source and rely on task-specific heuristics, limiting their generalization to complex scenarios. To overcome these limitations, we introduce the *Hallucination Risk Bound*, a unified theoretical framework that formally decomposes hallucination risk into data-driven and reasoning-driven components, linked respectively to training-time mismatches and inference-time instabilities. This provides a principled foundation for analyzing how hallucinations emerge and evolve. Building on this foundation, we introduce **HalluGuard**, a score based on the Neural Tangent Kernel (NTK) that leverages the induced geometry and captured representations of the NTK to jointly identify data-driven and reasoning-driven hallucinations. We evaluate **HalluGuard** across 10 diverse benchmarks and 9 popular LLM backbones against 11 competitive baselines, where it consistently achieves state-of-the-art performance in detecting diverse forms of LLM hallucinations. We open-source **HalluGuard** at https://github.com/Susan571/HalluGuard-ICLR2026.

Cite

Text

Zeng et al. "HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs." International Conference on Learning Representations, 2026.

Markdown

[Zeng et al. "HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zeng2026iclr-halluguard/)

BibTeX

@inproceedings{zeng2026iclr-halluguard,
  title     = {{HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs}},
  author    = {Zeng, Xinyue and Lin, Junhong and Yan, Yujun and Guo, Feng and Shi, Liang and Wu, Jun and Zhou, Dawei},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zeng2026iclr-halluguard/}
}