On the Thinking-Language Modeling Gap in Large Language Models

Abstract

Large Language Models (LLMs) demonstrate remarkable capabilities in solving complicated reasoning tasks by imitating the human thinking process as expressed in human language. However, even the most capable LLMs can still fail at tasks that are simple for humans. To understand this gap, we construct structural causal models of next-token predictors in human languages. Because language is primarily a tool for humans to share knowledge rather than to think, modeling human thinking from language can introduce language expression biases into LLMs. More specifically, we show that LLMs can fail to understand implicit expressions, that is, expression patterns that occur less frequently during training. Consequently, LLMs can easily overlook critical information when biased by implicit expressions. We verify our theoretical claims with carefully constructed realistic datasets containing implicit expressions. We further propose a prompt-level intervention that instructs LLMs to carefully expand on and attend to all available expressions. The empirical success of the prompt-level intervention across 11 tasks and 4 representative LLMs, along with improvements on general reasoning tasks, supports our findings. Our code is publicly available at the project website: https://causalcoat.github.io/lot

Cite

Text

Liu et al. "On the Thinking-Language Modeling Gap in Large Language Models." International Conference on Learning Representations, 2026.

Markdown

[Liu et al. "On the Thinking-Language Modeling Gap in Large Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/liu2026iclr-thinkinglanguage/)

BibTeX

@inproceedings{liu2026iclr-thinkinglanguage,
  title     = {{On the Thinking-Language Modeling Gap in Large Language Models}},
  author    = {Liu, Chenxi and Chen, Yongqiang and Liu, Tongliang and Cheng, James and Han, Bo and Zhang, Kun},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/liu2026iclr-thinkinglanguage/}
}