Multi-Turn Evaluation of Anthropomorphic Behaviours in Large Language Models

Ibrahim, Lujain; Akbulut, Canfer; Elasmar, Rasmi; Rastogi, Charvi; Kahng, Minsuk; Morris, Meredith Ringel; McKee, Kevin R.; Rieser, Verena; Shanahan, Murray; Weidinger, Laura

ICLR 2026

Abstract

The tendency of users to anthropomorphise large language models (LLMs) is of growing societal interest. Here, we present AnthroBench: a novel empirical method and tool for evaluating anthropomorphic LLM behaviours in realistic settings. Our work introduces three key advances. First, we develop a multi-turn evaluation of 14 distinct anthropomorphic behaviours, moving beyond single-turn assessment. Second, we present a scalable, automated approach that leverages simulations of user interactions, enabling efficient and reproducible assessment. Third, we conduct an interactive, large-scale human subject study (N=1101) to empirically validate that the model behaviours we measure predict real users’ anthropomorphic perceptions. We find that all evaluated LLMs exhibit similar behaviours, primarily characterised by relationship-building (e.g., empathy and validation) and first-person pronoun use. Crucially, we observe that the majority of these anthropomorphic behaviours first occur only after multiple turns, underscoring the necessity of multi-turn evaluations for understanding complex social phenomena in human-AI interaction. Our work provides a robust empirical foundation for investigating how design choices influence anthropomorphic model behaviours and for progressing the ethical debate on the desirability of these behaviours.

Cite

Text

Ibrahim et al. "Multi-Turn Evaluation of Anthropomorphic Behaviours in Large Language Models." International Conference on Learning Representations, 2026.

Markdown

[Ibrahim et al. "Multi-Turn Evaluation of Anthropomorphic Behaviours in Large Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/ibrahim2026iclr-multiturn/)

BibTeX

@inproceedings{ibrahim2026iclr-multiturn,
  title     = {{Multi-Turn Evaluation of Anthropomorphic Behaviours in Large Language Models}},
  author    = {Ibrahim, Lujain and Akbulut, Canfer and Elasmar, Rasmi and Rastogi, Charvi and Kahng, Minsuk and Morris, Meredith Ringel and McKee, Kevin R. and Rieser, Verena and Shanahan, Murray and Weidinger, Laura},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/ibrahim2026iclr-multiturn/}
}