Epistemic Side Effects & Avoiding Them (Sometimes)
Abstract
AI safety research has investigated the problem of negative side effects -- undesirable changes made by AI systems in pursuit of an underspecified objective. However, the focus has been on physical side effects, such as a robot breaking a vase while moving. In this paper we introduce the notion of epistemic side effects, unintended changes made to the knowledge or beliefs of agents, and describe a way to avoid negative epistemic side effects in reinforcement learning, in some cases.
Cite
Text
Klassen et al. "Epistemic Side Effects & Avoiding Them (Sometimes)." NeurIPS 2022 Workshops: MLSW, 2022.

Markdown
[Klassen et al. "Epistemic Side Effects & Avoiding Them (Sometimes)." NeurIPS 2022 Workshops: MLSW, 2022.](https://mlanthology.org/neuripsw/2022/klassen2022neuripsw-epistemic/)

BibTeX
@inproceedings{klassen2022neuripsw-epistemic,
title = {{Epistemic Side Effects \& Avoiding Them (Sometimes)}},
author = {Klassen, Toryn Q. and Alamdari, Parand Alizadeh and McIlraith, Sheila A.},
booktitle = {NeurIPS 2022 Workshops: MLSW},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/klassen2022neuripsw-epistemic/}
}