DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning

Abstract

This paper proposes DeepSynth, a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian, and progress towards the reward requires achieving an unknown sequence of high-level objectives. Our method employs a novel algorithm for synthesis of compact automata to uncover this sequential structure automatically. We synthesise a human-interpretable automaton from trace data collected by exploring the environment. The state space of the environment is then enriched with the synthesised automaton, so that the generation of a control policy by deep RL is guided by the discovered structure encoded in the automaton. The proposed approach is able to cope with both high-dimensional, low-level features and unknown sparse non-Markovian rewards. We have evaluated DeepSynth's performance in a set of experiments that includes the Atari game Montezuma's Revenge. Compared to existing approaches, we obtain a reduction of two orders of magnitude in the number of iterations required for policy synthesis, and also a significant improvement in scalability.
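The idea of enriching the environment state with an automaton state can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `LabelAutomaton`, the function `product_state`, the labels `"key"` and `"door"`, and the room names are all hypothetical, and the automaton here is written by hand rather than synthesised from traces. The sketch only shows how pairing each low-level state with the current automaton state records which high-level objectives have already been reached.

```python
from typing import Dict, Hashable, List, Tuple


class LabelAutomaton:
    """Hand-written deterministic automaton over high-level labels (hypothetical)."""

    def __init__(self, transitions: Dict[Tuple[int, str], int], initial: int = 0) -> None:
        self.transitions = transitions
        self.initial = initial
        self.state = initial

    def reset(self) -> int:
        """Return the automaton to its initial state."""
        self.state = self.initial
        return self.state

    def step(self, label: str) -> int:
        """Advance on a label; unknown (state, label) pairs leave the state unchanged."""
        self.state = self.transitions.get((self.state, label), self.state)
        return self.state


def product_state(env_state: Hashable, automaton_state: int) -> Tuple[Hashable, int]:
    """Pair a low-level environment state with the current automaton state."""
    return (env_state, automaton_state)


# Assumed toy task: pick up a key (state 0 -> 1), then reach a door (state 1 -> 2).
automaton = LabelAutomaton({(0, "key"): 1, (1, "door"): 2})

# Assumed trace of (environment state, observed high-level label) pairs.
trace: List[Tuple[str, str]] = [
    ("room_a", "empty"),
    ("room_a", "key"),
    ("room_b", "empty"),
    ("room_b", "door"),
]

automaton.reset()
product_trace = [product_state(s, automaton.step(label)) for s, label in trace]
```

In this toy trace, the same environment state `room_a` appears with automaton state 0 before the key is collected and with state 1 afterwards, so a policy defined over the paired state can distinguish the two situations even though the raw observation is identical.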

Cite

Text

Hasanbeig et al. "DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I9.16935

Markdown

[Hasanbeig et al. "DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/hasanbeig2021aaai-deepsynth/) doi:10.1609/AAAI.V35I9.16935

BibTeX

@inproceedings{hasanbeig2021aaai-deepsynth,
  title     = {{DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning}},
  author    = {Hasanbeig, Mohammadhosein and Jeppu, Natasha Yogananda and Abate, Alessandro and Melham, Tom and Kroening, Daniel},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {7647--7656},
  doi       = {10.1609/AAAI.V35I9.16935},
  url       = {https://mlanthology.org/aaai/2021/hasanbeig2021aaai-deepsynth/}
}