Logical Consistency of Large Language Models in Fact-Checking

Abstract

In recent years, large language models (LLMs) have demonstrated significant success in varied natural language tasks such as language translation, question answering, summarization, and fact-checking. Despite their impressive ability to generate human-like text, LLMs are infamous for inconsistent responses: a meaning-preserving change in the input query can produce a contradictory answer, a weakness attributed to vulnerabilities of LLMs such as hallucination. Consequently, existing research focuses on simple paraphrasing-based consistency assessment of LLMs and ignores complex queries that demand a deeper understanding of logical reasoning by an LLM. *Our work therefore addresses the logical inconsistency of LLMs under complex logical queries with primitive logical operators, e.g., negation, conjunction, and disjunction.* As a test bed, we consider retrieval-augmented LLMs on a fact-checking task involving propositional logic queries over knowledge graphs (KGs). Our contributions are three-fold. **Benchmark:** We introduce three logical fact-checking datasets over KGs to support community development of logically consistent LLMs. **Assessment:** We propose consistency measures for LLMs on propositional logic queries and demonstrate that existing LLMs lack logical consistency, especially on complex queries. **Improvement:** We employ supervised fine-tuning to improve the logical consistency of LLMs on the complex fact-checking task with KG contexts. Our source code and benchmarks are publicly available.
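To make the setting concrete, below is a minimal Python sketch of how a propositional fact-checking query over a KG can be evaluated, and how one plausible negation-consistency score might be computed. The KG triples, query strings, and the `llm_answer` callable are all hypothetical illustrations, not the paper's actual benchmark or metrics.

```python
# A toy KG as a set of (subject, relation, object) triples.
# All entity and relation names here are hypothetical.
KG = {
    ("Marie_Curie", "won", "Nobel_Prize"),
    ("Marie_Curie", "born_in", "Warsaw"),
    ("Albert_Einstein", "won", "Nobel_Prize"),
}

def fact(s, r, o):
    """Atomic fact, closed-world: triples absent from the KG are false."""
    return (s, r, o) in KG

# Complex queries built from the primitive operators the paper studies.
conj = fact("Marie_Curie", "won", "Nobel_Prize") and \
       fact("Marie_Curie", "born_in", "Warsaw")           # conjunction -> True
neg  = not fact("Albert_Einstein", "born_in", "Warsaw")   # negation    -> True
disj = fact("Marie_Curie", "born_in", "Paris") or neg     # disjunction -> True

def negation_consistency(llm_answer, queries):
    """One hedged reading of a consistency measure: the fraction of
    queries on which the model's boolean answers to q and to its
    negation are complementary.  `llm_answer` is a hypothetical
    callable mapping a natural-language query string to True/False."""
    flips = sum(
        llm_answer(q) != llm_answer(f"It is not the case that {q}")
        for q in queries
    )
    return flips / len(queries)
```

A perfectly logically consistent model would score 1.0 under such a measure; the abstract's claim is that existing LLMs fall short of this, especially as queries compose more operators.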

Cite

Text

Ghosh et al. "Logical Consistency of Large Language Models in Fact-Checking." International Conference on Learning Representations, 2025.

Markdown

[Ghosh et al. "Logical Consistency of Large Language Models in Fact-Checking." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/ghosh2025iclr-logical/)

BibTeX

@inproceedings{ghosh2025iclr-logical,
  title     = {{Logical Consistency of Large Language Models in Fact-Checking}},
  author    = {Ghosh, Bishwamittra and Hasan, Sarah and Arafat, Naheed Anjum and Khan, Arijit},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/ghosh2025iclr-logical/}
}