SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

Abstract

*Humans are social beings*; we pursue social goals in our daily interactions, which is a crucial aspect of social intelligence. Yet, AI systems' abilities in this realm remain elusive. We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and evaluate their social intelligence. In our environment, agents role-play and *interact* under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals. We simulate the role-play interaction between LLM-based agents and humans within this task space and evaluate their performance with a holistic evaluation framework called SOTOPIA-Eval. With SOTOPIA, we find significant differences between these models in terms of their social intelligence, and we identify a subset of SOTOPIA scenarios, SOTOPIA-hard, that is generally challenging for all models. We find that on this subset, GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills. These findings demonstrate SOTOPIA's promise as a general platform for research on evaluating and improving social intelligence in artificial agents.

Cite

Text

Zhou et al. "SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents." International Conference on Learning Representations, 2024.

Markdown

[Zhou et al. "SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/zhou2024iclr-sotopia/)

BibTeX

@inproceedings{zhou2024iclr-sotopia,
  title     = {{SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents}},
  author    = {Zhou, Xuhui and Zhu, Hao and Mathur, Leena and Zhang, Ruohong and Yu, Haofei and Qi, Zhengyang and Morency, Louis-Philippe and Bisk, Yonatan and Fried, Daniel and Neubig, Graham and Sap, Maarten},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/zhou2024iclr-sotopia/}
}