Are Large Language Models Chameleons?

Abstract

Do large language models (LLMs) have their own worldviews and personality tendencies? We conducted more than one million simulations in which LLMs were asked to answer subjective questions. Comparing the responses of different LLMs with real data from the European Social Survey (ESS) suggests that prompts have a fundamental effect on bias and variability, revealing major cultural, age, and gender biases. We discuss methods for measuring the difference between LLM responses and survey data, including weighted means and a newly proposed measure inspired by Jaccard similarity. We conclude that the robustness and variability of prompts should be analyzed before LLMs are used to model individual decisions or collective behavior, as their imitation abilities are approximate at best.
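For context, the classic Jaccard similarity that inspired the proposed measure can be sketched as below. This is only the standard set formula, not the paper's own variant, whose exact definition is not reproduced here:

```python
def jaccard_similarity(a, b):
    """Classic Jaccard similarity: |A ∩ B| / |A ∪ B|.

    Returns 1.0 when both collections are empty (fully identical).
    Note: the paper proposes a measure *inspired by* this formula;
    this sketch shows only the standard definition.
    """
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Example: comparing two sets of selected answer options.
print(jaccard_similarity({1, 2, 3}, {2, 3, 4}))  # → 0.5
```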

Cite

Text

Geng et al. "Are Large Language Models Chameleons?" ICML 2024 Workshops: LLMs_and_Cognition, 2024.

Markdown

[Geng et al. "Are Large Language Models Chameleons?" ICML 2024 Workshops: LLMs_and_Cognition, 2024.](https://mlanthology.org/icmlw/2024/geng2024icmlw-large/)

BibTeX

@inproceedings{geng2024icmlw-large,
  title     = {{Are Large Language Models Chameleons?}},
  author    = {Geng, Mingmeng and He, Sihong and Trotta, Roberto},
  booktitle = {ICML 2024 Workshops: LLMs_and_Cognition},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/geng2024icmlw-large/}
}