Reinforcement Learning of Causal Variables Using Mediation Analysis

Abstract

We consider the problem of acquiring causal representations and concepts in a reinforcement learning setting. Our approach defines a causal variable as one that is both manipulable by a policy and predictive of the outcome. We thereby obtain a parsimonious causal graph in which interventions occur at the level of policies. The approach avoids defining a generative model of the data, pre-processing the observations, or learning the transition kernel of the Markov decision process. Instead, causal variables and policies are determined by maximizing a new optimization target inspired by mediation analysis, which differs from the expected return. The maximization is accomplished using a generalization of Bellman's equation which is shown to converge, and the method finds meaningful causal representations in a simulated environment.
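For context on the mediation-analysis quantities the abstract alludes to, the classical natural direct and indirect effects can be illustrated on a toy structural model with a treatment T, a mediator M, and an outcome Y. This is a hedged background sketch of standard causal mediation analysis (the structural equations below are invented for illustration), not the paper's policy-level optimization target.

```python
# Toy linear structural causal model: T -> M -> Y (mediated path) and
# T -> Y (direct path). All equations are hypothetical, chosen so the
# effects work out to simple numbers.

def M(t):
    """Mediator as a function of treatment (hypothetical structural equation)."""
    return 2.0 * t

def Y(t, m):
    """Outcome given treatment and mediator (hypothetical structural equation)."""
    return 3.0 * m + 1.0 * t

def total_effect():
    """Change in Y when T switches from 0 to 1 and M responds naturally."""
    return Y(1.0, M(1.0)) - Y(0.0, M(0.0))

def natural_direct_effect():
    """Vary T along the direct edge while freezing the mediator at M(0)."""
    return Y(1.0, M(0.0)) - Y(0.0, M(0.0))

def natural_indirect_effect():
    """Hold T fixed on the direct edge, but let the mediator respond to T=1."""
    return Y(0.0, M(1.0)) - Y(0.0, M(0.0))

print(total_effect())             # 7.0
print(natural_direct_effect())    # 1.0
print(natural_indirect_effect())  # 6.0
```

Because this model is linear with no treatment-mediator interaction, the total effect decomposes exactly as direct plus indirect (7.0 = 1.0 + 6.0); a variable whose indirect effect dominates is, in the spirit of the paper, a strong mediator between policy and outcome.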

Cite

Text

Herlau and Larsen. "Reinforcement Learning of Causal Variables Using Mediation Analysis." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I6.20648

Markdown

[Herlau and Larsen. "Reinforcement Learning of Causal Variables Using Mediation Analysis." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/herlau2022aaai-reinforcement/) doi:10.1609/AAAI.V36I6.20648

BibTeX

@inproceedings{herlau2022aaai-reinforcement,
  title     = {{Reinforcement Learning of Causal Variables Using Mediation Analysis}},
  author    = {Herlau, Tue and Larsen, Rasmus},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {6910--6917},
  doi       = {10.1609/AAAI.V36I6.20648},
  url       = {https://mlanthology.org/aaai/2022/herlau2022aaai-reinforcement/}
}