Variable-Agnostic Causal Exploration for Reinforcement Learning

Abstract

Modern reinforcement learning (RL) struggles to capture real-world cause-and-effect dynamics, leading to inefficient exploration due to extensive trial-and-error actions. While recent efforts to improve agent exploration have leveraged causal discovery, they often make unrealistic assumptions about the causal variables in the environment. In this paper, we introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL), which incorporates causal relationships to drive exploration in RL without specifying environmental causal variables. Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms. It then constructs a causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion. The graph can be used to generate intrinsic rewards or to establish a hierarchy of subgoals that improves exploration efficiency. Experimental results show a significant improvement in agent performance in grid-world, 2D game, and robotic domains, particularly in scenarios with sparse rewards and noisy actions, such as the notorious Noisy-TV environments.
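
The pipeline described in the abstract (select crucial steps by attention, connect them in a causal graph, reward causally relevant pairs) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `top_k_steps`, `causal_depth_to_goal`, and `intrinsic_reward`, the hand-written attention weights and edges, and the choice of "graph distance to the goal" as a stand-in for causal influence are all assumptions made for illustration. In the paper, the attention weights and the graph are learned rather than given.

```python
from collections import deque


def top_k_steps(attention, k):
    """Return the indices of the k highest-attention steps, in time order.

    `attention` is a hypothetical list of per-step weights, assumed to come
    from some attention model over a trajectory.
    """
    ranked = sorted(range(len(attention)), key=lambda i: attention[i], reverse=True)
    return sorted(ranked[:k])


def causal_depth_to_goal(edges, goal):
    """Walk backwards from `goal` over (cause, effect) edges with BFS.

    Returns {node: number of edges separating the node from the goal}.
    Nodes with no directed path to the goal are absent from the result.
    """
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, []).append(cause)

    depth = {goal: 0}
    queue = deque([goal])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in depth:
                depth[parent] = depth[node] + 1
                queue.append(parent)
    return depth


def intrinsic_reward(pair, depth, scale=1.0):
    """Assumed bonus rule: pairs in the graph earn more the closer they are
    to the goal; pairs outside the graph (e.g. noise) earn nothing."""
    if pair not in depth:
        return 0.0
    return scale / (1 + depth[pair])


# Toy data: (observation, action) pairs named by hand for illustration.
KEY = ("key_room", "pick_up")
DOOR = ("door", "open")
GOAL = ("goal", "reach")
NOISE = ("tv", "watch")

attention = [0.05, 0.4, 0.05, 0.3, 0.2]
crucial = top_k_steps(attention, k=2)

edges = [(KEY, DOOR), (DOOR, GOAL)]
depth = causal_depth_to_goal(edges, GOAL)
bonuses = {pair: intrinsic_reward(pair, depth) for pair in (KEY, DOOR, NOISE)}
```

Under these assumptions the noisy pair receives no bonus because it has no path to the goal in the graph, which is the intuition behind robustness to Noisy-TV-style distractions; the paper's actual reward and subgoal constructions may differ.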

Cite

Text

Nguyen et al. "Variable-Agnostic Causal Exploration for Reinforcement Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024. doi:10.1007/978-3-031-70344-7_13

Markdown

[Nguyen et al. "Variable-Agnostic Causal Exploration for Reinforcement Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024.](https://mlanthology.org/ecmlpkdd/2024/nguyen2024ecmlpkdd-variableagnostic/) doi:10.1007/978-3-031-70344-7_13

BibTeX

@inproceedings{nguyen2024ecmlpkdd-variableagnostic,
  title     = {{Variable-Agnostic Causal Exploration for Reinforcement Learning}},
  author    = {Nguyen, Minh Hoang and Le, Hung and Venkatesh, Svetha},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2024},
  pages     = {216--232},
  doi       = {10.1007/978-3-031-70344-7_13},
  url       = {https://mlanthology.org/ecmlpkdd/2024/nguyen2024ecmlpkdd-variableagnostic/}
}