Discovering Agents (Abstract Reprint)
Abstract
Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial – often the causal model is just assumed by the modeller without much justification – and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents – roughly that agents are systems that would adapt their policy if their actions influenced the world in a different way. From this we derive the first causal discovery algorithm for discovering the presence of agents from empirical data, given a set of variables and under certain assumptions. We also provide algorithms for translating between causal models and game-theoretic influence diagrams. We demonstrate our approach by resolving some previous confusions caused by incorrect causal modelling of agents.
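As a rough illustration of the informal definition in the abstract (a system counts as an agent if it would adapt its policy when its actions influenced the world differently), the toy sketch below intervenes on a mechanism and checks whether a system's chosen action changes. This is not the paper's discovery algorithm. The names `adaptive_system`, `fixed_system`, and `adapts_to_mechanism`, and the two-action setup, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# Hypothetical toy "mechanism": maps each action to the utility the world returns.
Mechanism = Dict[int, float]
System = Callable[[Mechanism], int]


def adaptive_system(mechanism: Mechanism) -> int:
    """Chooses the action with the highest utility under the given mechanism."""
    return max(mechanism, key=mechanism.get)


def fixed_system(mechanism: Mechanism) -> int:
    """Ignores the mechanism entirely and always outputs action 0."""
    return 0


def adapts_to_mechanism(system: System, baseline: Mechanism, intervened: Mechanism) -> bool:
    """Returns True if the system's action differs after the mechanism is changed."""
    return system(baseline) != system(intervened)


# Baseline world: action 0 is rewarded. Intervened world: action 1 is rewarded.
baseline = {0: 1.0, 1: 0.0}
intervened = {0: 0.0, 1: 1.0}

adaptive_flag = adapts_to_mechanism(adaptive_system, baseline, intervened)
fixed_flag = adapts_to_mechanism(fixed_system, baseline, intervened)
```

In this toy setting only `adaptive_system` changes its action under the intervention, which is the behaviour the abstract's informal definition associates with agents. The paper itself works with causal models and empirical data under stated assumptions, which this sketch does not attempt to reproduce.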
Cite
Text
Kenton et al. "Discovering Agents (Abstract Reprint)." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I20.30601
Markdown
[Kenton et al. "Discovering Agents (Abstract Reprint)." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/kenton2024aaai-discovering/) doi:10.1609/AAAI.V38I20.30601
BibTeX
@inproceedings{kenton2024aaai-discovering,
title = {{Discovering Agents (Abstract Reprint)}},
author = {Kenton, Zachary and Kumar, Ramana and Farquhar, Sebastian and Richens, Jonathan and MacDermott, Matt and Everitt, Tom},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {22701},
doi = {10.1609/AAAI.V38I20.30601},
url = {https://mlanthology.org/aaai/2024/kenton2024aaai-discovering/}
}