Unsupervised Learning of Temporal Abstractions Using Slot-Based Transformers

Abstract

The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems. Previous approaches propose to learn such temporal abstractions in a purely unsupervised fashion by observing state-action trajectories gathered from executing a policy. However, a current limitation is that they process each trajectory in an entirely sequential manner, which prevents them from revising earlier decisions about sub-routine boundary points in light of new incoming information. In this work we propose SloTTAr, a fully parallel approach that integrates sequence processing Transformers with a Slot Attention module for learning about sub-routines in an unsupervised fashion. We demonstrate that SloTTAr outperforms strong baselines in terms of boundary point discovery, while being up to $30\mathrm{x}$ faster on existing benchmarks.
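To make the core idea concrete, below is a minimal sketch (in PyTorch) of a Slot Attention module of the kind the abstract refers to (Locatello et al., 2020), applied to Transformer-encoded trajectory features. All names, sizes, and hyperparameters here are illustrative assumptions, not the authors' actual implementation; the sketch only shows how slots can competitively group timesteps in parallel.

```python
# Illustrative sketch only: Slot Attention over temporal features.
# The module names and hyperparameters are assumptions, not SloTTAr's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlotAttention(nn.Module):
    def __init__(self, num_slots: int, dim: int, iters: int = 3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        # Slots are initialized from a learned Gaussian.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_logsigma = nn.Parameter(torch.zeros(1, 1, dim))
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.norm_in = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs: torch.Tensor):
        # inputs: (batch, T, dim) features, e.g. from a Transformer encoder
        # that processed the whole state-action trajectory in parallel.
        b = inputs.shape[0]
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_mu + self.slots_logsigma.exp() * torch.randn(
            b, self.num_slots, self.slots_mu.shape[-1], device=inputs.device)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            # Softmax over the *slot* axis: slots compete for timesteps.
            attn = F.softmax(torch.einsum('bsd,btd->bst', q, k) * self.scale, dim=1)
            attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean over T
            updates = torch.einsum('bst,btd->bsd', attn, v)
            slots = self.gru(
                updates.reshape(-1, updates.shape[-1]),
                slots.reshape(-1, slots.shape[-1]),
            ).reshape(b, self.num_slots, -1)
            slots = slots + self.mlp(self.norm_mlp(slots))
        # attn gives per-timestep soft assignments of timesteps to slots.
        return slots, attn
```

Because every iteration attends over the full sequence at once, the per-timestep slot assignments can be revised jointly rather than committed to left-to-right, which is the parallelism advantage the abstract contrasts with sequential approaches; reading boundary points off the attention masks is one plausible use of the output, not a claim about the paper's exact decoding procedure.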

Cite

Text

Gopalakrishnan et al. "Unsupervised Learning of Temporal Abstractions Using Slot-Based Transformers." NeurIPS 2021 Workshops: DeepRL, 2021.

Markdown

[Gopalakrishnan et al. "Unsupervised Learning of Temporal Abstractions Using Slot-Based Transformers." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/gopalakrishnan2021neuripsw-unsupervised/)

BibTeX

@inproceedings{gopalakrishnan2021neuripsw-unsupervised,
  title     = {{Unsupervised Learning of Temporal Abstractions Using Slot-Based Transformers}},
  author    = {Gopalakrishnan, Anand and Irie, Kazuki and Schmidhuber, Jürgen and van Steenkiste, Sjoerd},
  booktitle = {NeurIPS 2021 Workshops: DeepRL},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/gopalakrishnan2021neuripsw-unsupervised/}
}