LogiConBench: Benchmarking Logical Consistencies of LLMs

Chen, Zheng; Zhou, Chuan; Cheng, Fengxiang; Po, Yip Tin; Liu, Fenrong; Wang, Yisen; Chai, Jiajun; Wang, Xiaohan; Yin, Guojun; Lin, Wei; Li, Bo; Li, Haoxuan; Lin, Zhouchen

ICLR 2026

Abstract

Logical consistency, the requirement that statements remain non-contradictory under logical rules, is fundamental for trustworthy reasoning, yet current LLMs often fail to maintain it even on simple inference tasks. Existing benchmarks for LLM logical consistency are not scalable, not diverse, and not challenging, with state-of-the-art models already surpassing 95% accuracy. LogiConBench is the first benchmark that (1) generates unlimited logical rule combinations with precise labels, (2) provides controllable-depth graphs with explicit reasoning paths, and (3) remains challenging for state-of-the-art LLMs. To achieve this, LogiConBench automatically generates logical graphs where nodes represent symbolic propositions and edges denote reasoning relations. From these graphs, it samples lists of propositions, extracts reasoning paths, determines all consistent label lists, and translates them into diverse natural language expressions. While we release a 280K-sample corpus in this work, the framework can be scaled to generate unlimited data. To demonstrate its evaluative value, we evaluate 14 frontier LLMs on three tasks with varying difficulty levels and find that the Enumerative task remains extremely challenging, with the best exact accuracy reaching only 34%. Our code and data are available at https://github.com/Bellafc/LogiConBench.git.
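
To make the step "determines all consistent label lists" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy graph whose edges are all plain implications and enumerates every truth assignment that violates none of them. The names `EDGES` and `consistent_labelings` are hypothetical, and the released framework may support other relation types and use a different procedure.

```python
from itertools import product

# Hypothetical toy graph: an edge (a, b) is read as "a implies b".
EDGES = [("P", "Q"), ("Q", "R")]


def consistent_labelings(props, edges):
    """Return every True/False labeling of props that violates no implication edge.

    Assumption: props contains every node that appears in edges.
    """
    results = []
    for values in product([True, False], repeat=len(props)):
        labels = dict(zip(props, values))
        # An implication a -> b is violated only when a is True and b is False.
        if all(not (labels[a] and not labels[b]) for a, b in edges):
            results.append(labels)
    return results


labelings = consistent_labelings(["P", "Q", "R"], EDGES)
for labels in labelings:
    print(labels)
```

Brute-force enumeration is exponential in the number of propositions, so it only suits small sampled lists; it is used here because it makes the definition of a consistent label list easy to check by hand.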

Cite

Text

Chen et al. "LogiConBench: Benchmarking Logical Consistencies of LLMs." International Conference on Learning Representations, 2026.

Markdown

[Chen et al. "LogiConBench: Benchmarking Logical Consistencies of LLMs." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/chen2026iclr-logiconbench/)

BibTeX

@inproceedings{chen2026iclr-logiconbench,
  title     = {{LogiConBench: Benchmarking Logical Consistencies of LLMs}},
  author    = {Chen, Zheng and Zhou, Chuan and Cheng, Fengxiang and Po, Yip Tin and Liu, Fenrong and Wang, Yisen and Chai, Jiajun and Wang, Xiaohan and Yin, Guojun and Lin, Wei and Li, Bo and Li, Haoxuan and Lin, Zhouchen},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/chen2026iclr-logiconbench/}
}