Between Prudence and Paranoia: Theory of Mind Gone Right, and Wrong
Abstract
Agents need to be on their toes when interacting with competitive others to avoid being duped. Too much vigilance out of context can, however, be detrimental and produce paranoia. Here, we offer a formal account of this phenomenon through the lens of theory of mind. We simulate agents of different depths of mentalization and show how, if aligned well, deep recursive mentalization gives rise to both successful deception and reasonable skepticism. However, we also show how, if theory of mind is too sophisticated, agents become paranoid, losing trust and reward in the process. We discuss our findings in light of computational psychiatry and AI safety.
Cite
Text
Alon et al. "Between Prudence and Paranoia: Theory of Mind Gone Right, and Wrong." ICML 2023 Workshops: ToM, 2023.
Markdown
[Alon et al. "Between Prudence and Paranoia: Theory of Mind Gone Right, and Wrong." ICML 2023 Workshops: ToM, 2023.](https://mlanthology.org/icmlw/2023/alon2023icmlw-prudence/)
BibTeX
@inproceedings{alon2023icmlw-prudence,
title = {{Between Prudence and Paranoia: Theory of Mind Gone Right, and Wrong}},
author = {Alon, Nitay and Schulz, Lion and Dayan, Peter and Barnby, Joseph M},
booktitle = {ICML 2023 Workshops: ToM},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/alon2023icmlw-prudence/}
}