Compositional Reinforcement Learning from Logical Specifications

Abstract

We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DIRL, that interleaves high-level planning and reinforcement learning. First, DIRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. DIRL then interleaves reinforcement learning, which trains a neural network policy for each edge (sub-task), with a Dijkstra-style planning algorithm that computes a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines.
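
The following is a minimal Python sketch of the interleaving the abstract describes, not the paper's implementation: the abstract graph is a plain adjacency map, `train_edge_policy` is a hypothetical stand-in for the RL step, and the fixed success probability it returns is a placeholder. The edge cost used here, the negative log of the estimated success probability, is one natural choice that makes a shortest path maximize the product of success probabilities along the plan.

```python
import heapq
import math


def train_edge_policy(u, v, start_states):
    """Hypothetical stand-in for the RL step on the sub-task (edge) u -> v.

    A real implementation would train a neural network policy with an RL
    algorithm for continuous control and estimate its success probability
    and resulting state distribution from rollouts.
    """
    policy = f"pi_{u}->{v}"        # placeholder for a trained network
    success_prob = 0.9             # placeholder estimate from rollouts
    reached_states = start_states  # placeholder for states reaching v
    return policy, success_prob, reached_states


def dirl_plan(graph, source, target, init_states):
    """Dijkstra-style high-level planning interleaved with sub-task RL.

    `graph` maps each vertex (abstract region) to its successor vertices.
    Edge cost is -log(success probability) of the learned edge policy.
    """
    dist = {source: 0.0}
    states = {source: init_states}  # start-state distribution per vertex
    policies, parent = {}, {}
    frontier = [(0.0, source)]
    done = set()
    while frontier:
        d, u = heapq.heappop(frontier)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v in graph.get(u, []):
            # Learn a policy for the sub-task of reaching region v from u.
            pi, p, reached = train_edge_policy(u, v, states[u])
            cost = d - math.log(max(p, 1e-9))
            if cost < dist.get(v, float("inf")):
                dist[v] = cost
                states[v] = reached
                policies[v] = pi
                parent[v] = u
                heapq.heappush(frontier, (cost, v))
    if target not in parent:
        return None  # no plan found
    # Recover the high-level plan as a sequence of edge policies.
    plan, v = [], target
    while v != source:
        plan.append(policies[v])
        v = parent[v]
    return list(reversed(plan))


if __name__ == "__main__":
    # Toy abstract graph: regions S -> A -> T, plus a direct edge S -> T.
    abstract_graph = {"S": ["A", "T"], "A": ["T"]}
    print(dirl_plan(abstract_graph, "S", "T", init_states=["s0"]))
```

In this sketch, policies are trained lazily as the planner expands vertices, so learning effort is spent only on sub-tasks that the high-level plan might actually use.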

Cite

Text

Jothimurugan et al. "Compositional Reinforcement Learning from Logical Specifications." Neural Information Processing Systems, 2021.

Markdown

[Jothimurugan et al. "Compositional Reinforcement Learning from Logical Specifications." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/jothimurugan2021neurips-compositional/)

BibTeX

@inproceedings{jothimurugan2021neurips-compositional,
  title     = {{Compositional Reinforcement Learning from Logical Specifications}},
  author    = {Jothimurugan, Kishor and Bansal, Suguman and Bastani, Osbert and Alur, Rajeev},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/jothimurugan2021neurips-compositional/}
}