Expediting Reinforcement Learning by Incorporating Knowledge About Temporal Causality in the Environment

Abstract

Reinforcement learning (RL) algorithms struggle to learn optimal policies for tasks where reward feedback is sparse and depends on a complex sequence of events in the environment. Probabilistic reward machines (PRMs) are finite-state formalisms that can capture temporal dependencies in the reward signal, along with nondeterministic task outcomes. While specialized RL algorithms can exploit this finite-state structure to expedite learning, PRMs remain difficult to design and modify by hand. This hinders the already difficult tasks of utilizing high-level causal knowledge about the environment and transferring the reward formalism to a new domain with a different causal structure. This paper proposes a novel method to incorporate causal information in the form of Temporal Logic-based Causal Diagrams into the reward formalism, thereby expediting policy learning and aiding the transfer of task specifications to new environments. Furthermore, we provide a theoretical result about convergence to an optimal policy for our method, and demonstrate its strengths empirically.
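
To make the finite-state structure mentioned in the abstract concrete, the following is a minimal sketch of a probabilistic reward machine, not the authors' implementation. It assumes one common formulation in which each (state, label) pair maps to a distribution over (next state, reward) outcomes. The class name, the string-valued labels, the state names, and the coffee-delivery example are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ProbabilisticRewardMachine:
    """Illustrative PRM: a finite-state machine with stochastic transitions.

    transitions maps (state, label) to a list of
    (next_state, probability, reward) outcomes. This layout is an
    assumption made for the sketch, not the paper's data structure.
    """
    initial_state: str
    transitions: dict
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial_state

    def reset(self):
        """Return to the initial state at the start of an episode."""
        self.state = self.initial_state

    def step(self, label, rng=random):
        """Advance on an observed label and return the emitted reward."""
        outcomes = self.transitions.get((self.state, label))
        if outcomes is None:
            # Assumed convention: unlisted labels leave the state unchanged
            # and yield zero reward.
            return 0.0
        weights = [probability for _, probability, _ in outcomes]
        next_state, _, reward = rng.choices(outcomes, weights=weights, k=1)[0]
        self.state = next_state
        return reward


# Hypothetical task: fetch coffee, then deliver it to the office.
# The coffee machine is assumed to fail 10% of the time.
coffee_task = ProbabilisticRewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "coffee"): [("u1", 0.9, 0.0), ("u0", 0.1, 0.0)],
        ("u1", "office"): [("done", 1.0, 1.0)],
    },
)
```

In this sketch the reward for reaching the office depends on whether coffee was obtained earlier, which is the kind of temporal dependency a plain state-based reward function cannot express, and the 0.9/0.1 split stands in for a nondeterministic task outcome.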

Cite

Text

Corazza et al. "Expediting Reinforcement Learning by Incorporating Knowledge About Temporal Causality in the Environment." Proceedings of the Third Conference on Causal Learning and Reasoning, 2024.

Markdown

[Corazza et al. "Expediting Reinforcement Learning by Incorporating Knowledge About Temporal Causality in the Environment." Proceedings of the Third Conference on Causal Learning and Reasoning, 2024.](https://mlanthology.org/clear/2024/corazza2024clear-expediting/)

BibTeX

@inproceedings{corazza2024clear-expediting,
  title     = {{Expediting Reinforcement Learning by Incorporating Knowledge About Temporal Causality in the Environment}},
  author    = {Corazza, Jan and Aria, Hadi Partovi and Neider, Daniel and Xu, Zhe},
  booktitle = {Proceedings of the Third Conference on Causal Learning and Reasoning},
  year      = {2024},
  pages     = {643--664},
  volume    = {236},
  url       = {https://mlanthology.org/clear/2024/corazza2024clear-expediting/}
}