Compositional Reinforcement Learning from Logical Specifications
Abstract
We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DIRL, that interleaves high-level planning and reinforcement learning. First, DIRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Then, DIRL uses reinforcement learning to learn a neural network policy for each edge (sub-task) within a Dijkstra-style planning algorithm that computes a high-level plan in the graph. An evaluation on a set of challenging control benchmarks with continuous state and action spaces demonstrates that the proposed approach outperforms state-of-the-art baselines.
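To make the interleaving of planning and learning concrete, below is a minimal Python sketch of a Dijkstra-style loop over an abstract graph in which edge costs come from learned sub-task policies. This is an illustration, not the authors' implementation: the adjacency-list `graph`, the `train_edge_policy` callback (assumed to run RL for one sub-task and return a policy together with its estimated success probability), and the use of negative log success probability as the edge cost are all assumptions made here for the sketch. Minimizing the sum of `-log p` along a path corresponds to maximizing the product of sub-task success probabilities.

```python
import heapq
import math

def dijkstra_over_subtasks(graph, source, target, train_edge_policy):
    """Dijkstra-style planning over an abstract graph of sub-tasks (sketch).

    graph: dict mapping each abstract vertex to its successor vertices.
    train_edge_policy(u, v): hypothetical callback that runs RL for the
        sub-task "reach region v from region u" and returns
        (policy, success_prob).
    Returns the high-level plan as a list of ((u, v), policy) pairs,
    or None if the target is unreachable.
    """
    dist = {source: 0.0}
    pred, policies = {}, {}
    frontier = [(0.0, source)]
    done = set()
    while frontier:
        d, u = heapq.heappop(frontier)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v in graph.get(u, []):
            # Interleave learning with planning: train the sub-task policy
            # for edge (u, v) only when vertex u is expanded.
            policy, p = train_edge_policy(u, v)
            if p <= 0.0:
                continue  # sub-task never succeeded; treat edge as unusable
            cost = d - math.log(p)  # -log(prob) keeps edge weights non-negative
            if cost < dist.get(v, float("inf")):
                dist[v] = cost
                pred[v] = u
                policies[(u, v)] = policy
                heapq.heappush(frontier, (cost, v))
    if target != source and target not in pred:
        return None
    # Reconstruct the plan as a sequence of edges with their policies.
    path, v = [], target
    while v != source:
        u = pred[v]
        path.append(((u, v), policies[(u, v)]))
        v = u
    return list(reversed(path))
```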
Cite
Text
Jothimurugan et al. "Compositional Reinforcement Learning from Logical Specifications." Neural Information Processing Systems, 2021.

Markdown
[Jothimurugan et al. "Compositional Reinforcement Learning from Logical Specifications." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/jothimurugan2021neurips-compositional/)

BibTeX
@inproceedings{jothimurugan2021neurips-compositional,
  title     = {{Compositional Reinforcement Learning from Logical Specifications}},
  author    = {Jothimurugan, Kishor and Bansal, Suguman and Bastani, Osbert and Alur, Rajeev},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/jothimurugan2021neurips-compositional/}
}