In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models

Abstract

Recent advancements in artificial intelligence (AI) have led to the development of highly capable large language models (LLMs) that demonstrate significant human-like abilities. Yet these pretrained LLMs often remain vulnerable to cognitive biases. In this work, we study the A-Not-B error -- a hallmark of an early developmental stage in human infants, characterized by the persistence of previously rewarded behavior despite changed conditions that warrant even trivial adaptation. Our investigation reveals that LLMs, akin to human infants, erroneously apply past successful responses to slightly altered contexts. Employing a variety of reasoning tasks, we demonstrate that LLMs are susceptible to the A-Not-B error. Notably, smaller models exhibit heightened vulnerability, mirroring the developmental trajectory of human infants. Models pretrained on extensive, high-quality data show significantly greater resilience, highlighting the importance of internal knowledge quality, similar to how rich experience enhances human cognitive abilities. Furthermore, increasing the number of examples before a context change leads to more pronounced failures, underscoring that LLMs are fundamentally pattern-driven and may falter at even minor, innocuous shifts in an established pattern. We open-source all code and results under a permissive MIT license to encourage reproduction and further research.
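
To make the experimental setup concrete, the following is a minimal sketch (not the authors' released code) of how an A-Not-B style probe might be assembled: a run of in-context examples whose correct answer always falls on the same option, followed by a query whose correct answer shifts to a different option. The questions, option labels, and helper names below are hypothetical placeholders, not the paper's actual tasks.

# Illustrative A-Not-B probe construction; all items are placeholder assumptions.

def format_example(question, options, answer=None):
    """Render one multiple-choice item; include the answer for in-context examples."""
    lines = [question]
    lines += [f"({label}) {text}" for label, text in options]
    lines.append(f"Answer: {answer}" if answer is not None else "Answer:")
    return "\n".join(lines)

def build_a_not_b_prompt(context_examples, test_item):
    """Concatenate pattern-consistent examples with a pattern-breaking test query."""
    blocks = [format_example(q, opts, ans) for q, opts, ans in context_examples]
    blocks.append(format_example(*test_item))
    return "\n\n".join(blocks)

# Every in-context answer lands on option (A); the test item's correct answer is (B).
context = [
    ("2 + 2 = ?", [("A", "4"), ("B", "5")], "A"),
    ("3 + 1 = ?", [("A", "4"), ("B", "6")], "A"),
    ("1 + 3 = ?", [("A", "4"), ("B", "7")], "A"),
]
test = ("2 + 3 = ?", [("A", "4"), ("B", "5")])  # correct answer shifts to (B)

prompt = build_a_not_b_prompt(context, test)
print(prompt)
# An A-Not-B error corresponds to the model answering "A" out of habit,
# despite the trivially changed context making "B" the correct answer.

Increasing the length of the context list in this sketch mirrors the manipulation described above: the more pattern-consistent examples precede the change, the stronger the pull toward the previously rewarded option.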

Cite

Text

Han et al. "In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models." ICML 2024 Workshops: LLMs_and_Cognition, 2024.

Markdown

[Han et al. "In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models." ICML 2024 Workshops: LLMs_and_Cognition, 2024.](https://mlanthology.org/icmlw/2024/han2024icmlw-incontext/)

BibTeX

@inproceedings{han2024icmlw-incontext,
  title     = {{In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models}},
  author    = {Han, Pengrui and Song, Peiyang and Yu, Haofei and You, Jiaxuan},
  booktitle = {ICML 2024 Workshops: LLMs\_and\_Cognition},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/han2024icmlw-incontext/}
}