EPITOME: Experimental Protocol Inventory for Theory of Mind Evaluation

Jones, Cameron Robert; Trott, Sean; Bergen, Ben

ICMLW 2023

Abstract

We address a growing debate about the extent to which large language models (LLMs) produce behavior consistent with Theory of Mind (ToM) in humans. We present EPITOME: a battery of six experiments that tap diverse ToM capacities, including belief attribution, emotional inference, and pragmatic reasoning. We compare performance of five LLMs to a baseline of responses from human comprehenders. Results are mixed. LLMs display considerable sensitivity to mental states and match human performance in several tasks. Yet, they commit systematic errors in others, especially those requiring pragmatic reasoning on the basis of mental state information. Such uneven performance indicates that attributing ToM to LLMs might be premature.

Cite

Text

Jones et al. "EPITOME: Experimental Protocol Inventory for Theory of Mind Evaluation." ICML 2023 Workshops: ToM, 2023.

Markdown

[Jones et al. "EPITOME: Experimental Protocol Inventory for Theory of Mind Evaluation." ICML 2023 Workshops: ToM, 2023.](https://mlanthology.org/icmlw/2023/jones2023icmlw-epitome/)

BibTeX

@inproceedings{jones2023icmlw-epitome,
  title     = {{EPITOME: Experimental Protocol Inventory for Theory of Mind Evaluation}},
  author    = {Jones, Cameron Robert and Trott, Sean and Bergen, Ben},
  booktitle = {ICML 2023 Workshops: ToM},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/jones2023icmlw-epitome/}
}