Attributive Reasoning for Hallucination Diagnosis of Large Language Models

Abstract

In recent years, large language models (LLMs) have demonstrated outstanding capabilities across a wide range of tasks. However, LLMs also suffer from notable drawbacks, especially hallucination. Hallucination refers to the generation of content that does not align with the user input, contradicts previously generated content, or conflicts with world knowledge. Current research on hallucination mainly covers knowledge retrieval, prompt engineering, training data improvement, and reinforcement learning. However, these methods neither distinguish among different categories of hallucination, which is important for hallucination analysis, nor closely examine the internal states of LLMs, which indicate where hallucinations arise. Therefore, in our research, we introduce an attribution framework to trace the origins of hallucinations based on the internal signals of LLMs. To support this framework, we develop a new benchmark named RelQA-Cate, which covers eight categories of hallucinations in LLM-generated answers. We then present a novel Differential Penalty Decoding (DPD) strategy that reduces hallucinations by adjusting the posterior probability of each answer. We conduct a series of experiments and observe significant improvements in answer reliability, reaching 28.25% at most, which demonstrates the effectiveness of our proposed DPD and its generalization in mitigating hallucination in LLMs.
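The abstract does not spell out the DPD algorithm itself. Purely as an illustration of the general idea of penalty-adjusted answer re-ranking, the minimal Python sketch below discounts each candidate answer's log-probability by a penalty score (here a placeholder standing in for a hallucination-risk estimate derived from internal signals) and selects the highest-scoring answer. All function names, the penalty source, and the alpha hyperparameter are assumptions for this sketch, not the paper's actual method.

import math
from typing import List

def rerank_answers(answers: List[str],
                   log_probs: List[float],
                   penalties: List[float],
                   alpha: float = 1.0) -> str:
    """Pick the answer with the highest penalty-adjusted score.

    log_probs:  model log-probability of each candidate answer
    penalties:  per-answer penalty scores (placeholder for a
                hallucination-risk estimate from internal signals);
                higher means riskier
    alpha:      penalty strength (hypothetical hyperparameter)
    """
    # Subtract a scaled penalty from each answer's log-probability.
    adjusted = [lp - alpha * p for lp, p in zip(log_probs, penalties)]
    # Softmax-normalize the adjusted scores into a distribution
    # (shown for clarity; argmax alone would suffice for selection).
    z = max(adjusted)
    weights = [math.exp(a - z) for a in adjusted]
    total = sum(weights)
    probs = [w / total for w in weights]
    best = max(range(len(answers)), key=lambda i: probs[i])
    return answers[best]

For example, rerank_answers(["Paris", "Lyon"], [-0.4, -0.6], [0.9, 0.1]) would prefer "Lyon" once the penalty outweighs the raw log-probability gap, illustrating how a risk-aware penalty can override the model's unadjusted preference.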

Cite

Text

Chen et al. "Attributive Reasoning for Hallucination Diagnosis of Large Language Models." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I22.34536

Markdown

[Chen et al. "Attributive Reasoning for Hallucination Diagnosis of Large Language Models." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/chen2025aaai-attributive/) doi:10.1609/AAAI.V39I22.34536

BibTeX

@inproceedings{chen2025aaai-attributive,
  title     = {{Attributive Reasoning for Hallucination Diagnosis of Large Language Models}},
  author    = {Chen, Yuyan and Li, Zehao and You, Shuangjie and Chen, Zhengyu and Chang, Jingwen and Zhang, Yi and Dai, Weinan and Guo, Qingpei and Xiao, Yanghua},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {23660--23668},
  doi       = {10.1609/AAAI.V39I22.34536},
  url       = {https://mlanthology.org/aaai/2025/chen2025aaai-attributive/}
}