CogniLoad: A Synthetic Natural Language Reasoning Benchmark with Tunable Length, Intrinsic Difficulty, and Distractor Density

Kaiser, Daniel; Frigessi, Arnoldo; Ramezani-Kebrya, Ali; Ricaud, Benjamin

ICLR 2026

Abstract

Current benchmarks for long-context reasoning in Large Language Models (LLMs) often conflate critical factors such as intrinsic task complexity, distractor interference, and task length. To enable more precise failure analysis, we introduce CogniLoad, a novel synthetic benchmark grounded in Cognitive Load Theory (CLT). CogniLoad generates natural-language logic puzzles with independently tunable parameters that reflect CLT's core dimensions: intrinsic difficulty ($d$) controls intrinsic load; distractor-to-signal ratio ($\rho$) regulates extraneous load; and task length ($N$) serves as an operational proxy for conditions demanding germane load. In an evaluation of 22 state-of-the-art reasoning LLMs, CogniLoad reveals distinct performance sensitivities: task length emerges as a dominant constraint, tolerance to intrinsic complexity varies across models, and responses to distractor ratios are U-shaped. By offering systematic, factorial control over these cognitive load dimensions, CogniLoad provides a reproducible, scalable, and diagnostically rich tool for dissecting LLM reasoning limitations and guiding future model development.
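
To make the idea of factorial control concrete, the sketch below enumerates every combination of the three parameters named in the abstract ($d$, $\rho$, $N$). It is only an illustration of a full factorial grid: the function name `factorial_grid` and all level values are hypothetical and are not taken from the paper or its released code.

```python
from itertools import product

# Hypothetical factor levels, chosen only for illustration.
# They are NOT the values used in the CogniLoad paper.
DIFFICULTY_LEVELS = [1, 3, 5]        # d: intrinsic difficulty
DISTRACTOR_RATIOS = [0.1, 0.5, 0.9]  # rho: distractor-to-signal ratio
TASK_LENGTHS = [20, 50, 100]         # N: task length


def factorial_grid(difficulties, ratios, lengths):
    """Return every (d, rho, N) combination as a list of dicts."""
    return [
        {"d": d, "rho": rho, "N": n}
        for d, rho, n in product(difficulties, ratios, lengths)
    ]


grid = factorial_grid(DIFFICULTY_LEVELS, DISTRACTOR_RATIOS, TASK_LENGTHS)

# Holding two factors fixed isolates the effect of the third,
# e.g. all configurations that differ only in task length N.
length_sweep = [cfg for cfg in grid if cfg["d"] == 3 and cfg["rho"] == 0.5]
```

Because each factor is crossed with every level of the others, a slice such as `length_sweep` varies one dimension while the other two stay constant, which is what allows a failure to be attributed to a single load dimension.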

Cite

Text

Kaiser et al. "CogniLoad: A Synthetic Natural Language Reasoning Benchmark with Tunable Length, Intrinsic Difficulty, and Distractor Density." International Conference on Learning Representations, 2026.

Markdown

[Kaiser et al. "CogniLoad: A Synthetic Natural Language Reasoning Benchmark with Tunable Length, Intrinsic Difficulty, and Distractor Density." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/kaiser2026iclr-cogniload/)

BibTeX

@inproceedings{kaiser2026iclr-cogniload,
  title     = {{CogniLoad: A Synthetic Natural Language Reasoning Benchmark with Tunable Length, Intrinsic Difficulty, and Distractor Density}},
  author    = {Kaiser, Daniel and Frigessi, Arnoldo and Ramezani-Kebrya, Ali and Ricaud, Benjamin},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/kaiser2026iclr-cogniload/}
}