Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding

Abstract

Large Language Models (LLMs) often hallucinate, generating content inconsistent with the input. Retrieval-Augmented Generation (RAG) and Reinforcement Learning from Human Feedback (RLHF) can mitigate hallucinations but require resource-intensive retrieval or large-scale fine-tuning. Decoding-based methods are lighter yet lack explicit hallucination control. To address this, we present Token-Guard, a token-level hallucination control method based on self-checking decoding. Token-Guard performs internal verification at each reasoning step to detect hallucinated tokens before they propagate. Candidate fragments are further evaluated in a latent space with explicit hallucination risk scoring, while iterative pruning and regeneration dynamically correct detected errors. Experiments on HALU datasets show that Token-Guard substantially reduces hallucinations and improves generation accuracy, offering a scalable, lightweight solution for reliable LLM outputs. Our code is publicly available at https://github.com/rhq945/Token-Guard.
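
The verify, prune, and regenerate loop described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the names `guarded_decode`, `propose`, and `risk`, the threshold, and the retry budget are all hypothetical, and the risk score here is a simple lookup rather than the latent-space scoring the paper describes.

```python
from typing import Callable, List, Optional, Sequence


def guarded_decode(
    propose: Callable[[List[str]], Sequence[str]],  # hypothetical candidate generator
    risk: Callable[[List[str], str], float],        # hypothetical risk score in [0, 1]
    max_steps: int = 20,
    threshold: float = 0.5,   # assumed cut-off, not taken from the paper
    max_retries: int = 3,     # assumed regeneration budget
    eos: str = "<eos>",
) -> List[str]:
    """Toy decoding loop: score candidates, prune risky ones, retry if none survive."""
    output: List[str] = []
    for _ in range(max_steps):
        chosen: Optional[str] = None
        for _ in range(max_retries):
            candidates = propose(output)
            # Prune: drop every candidate whose risk exceeds the threshold.
            safe = [t for t in candidates if risk(output, t) <= threshold]
            if safe:
                # Keep the lowest-risk surviving candidate.
                chosen = min(safe, key=lambda t: risk(output, t))
                break
            # Otherwise fall through and ask for candidates again (regeneration).
        if chosen is None or chosen == eos:
            break  # stop on end-of-sequence or when nothing passes the check
        output.append(chosen)
    return output


# Toy setup: tokens supported by the "input" count as zero-risk.
SOURCE = {"paris", "is", "the", "capital", "of", "france"}
TARGET = ["paris", "is", "the", "capital", "of", "france", "<eos>"]


def toy_propose(prefix: List[str]) -> List[str]:
    nxt = TARGET[len(prefix)] if len(prefix) < len(TARGET) else "<eos>"
    return ["berlin", nxt]  # one unsupported distractor plus one supported token


def toy_risk(prefix: List[str], token: str) -> float:
    return 0.0 if token in SOURCE or token == "<eos>" else 1.0


result = guarded_decode(toy_propose, toy_risk)
print(" ".join(result))
```

In this toy run the unsupported token `berlin` is pruned at every step, so only tokens present in `SOURCE` reach the output. A real system would replace `toy_propose` and `toy_risk` with model-driven components.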

Cite

Text

Zhu et al. "Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding." International Conference on Learning Representations, 2026.

Markdown

[Zhu et al. "Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhu2026iclr-tokenguard/)

BibTeX

@inproceedings{zhu2026iclr-tokenguard,
  title     = {{Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding}},
  author    = {Zhu, Yifan and Rong, Huiqiang and Luo, Haoran},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zhu2026iclr-tokenguard/}
}