Abstract Understanding of Core-Knowledge Concepts: Humans vs. LLMs

Abstract

The ability to form and use abstractions in a few-shot manner is a key aspect of human cognition; it is this capacity that enables us to understand and act appropriately in novel situations. In this paper, we report on comparisons between humans and GPT-4V on visual tasks designed to systematically assess few-shot abstraction capabilities using core-knowledge concepts related to objectness, object motion, spatial configurations and relationships, and basic numerosity. We test the impact of presenting tasks to GPT-4V using visual, mixed text-visual, and text-only representations. Our findings highlight that GPT-4V, one of today's most advanced multimodal LLMs, still lacks the flexible intelligence that humans possess to efficiently relate different situations through novel abstractions.

Cite

Text

Palmarini and Mitchell. "Abstract Understanding of Core-Knowledge Concepts: Humans vs. LLMs." ICML 2024 Workshops: LLMs_and_Cognition, 2024.

Markdown

[Palmarini and Mitchell. "Abstract Understanding of Core-Knowledge Concepts: Humans vs. LLMs." ICML 2024 Workshops: LLMs_and_Cognition, 2024.](https://mlanthology.org/icmlw/2024/palmarini2024icmlw-abstract/)

BibTeX

@inproceedings{palmarini2024icmlw-abstract,
  title     = {{Abstract Understanding of Core-Knowledge Concepts: Humans vs. LLMs}},
  author    = {Palmarini, Alessandro B. and Mitchell, Melanie},
  booktitle = {ICML 2024 Workshops: LLMs_and_Cognition},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/palmarini2024icmlw-abstract/}
}