Temporal Difference Variational Auto-Encoder

Abstract

To act and plan in complex environments, we posit that agents should have a mental simulator of the world with three characteristics: (a) it should build an abstract state representing the condition of the world; (b) it should form a belief which represents uncertainty on the world; (c) it should go beyond simple step-by-step simulation, and exhibit temporal abstraction. Motivated by the absence of a model satisfying all these requirements, we propose TD-VAE, a generative sequence model that learns representations containing explicit beliefs about states several steps into the future, and that can be rolled out directly without single-step transitions. TD-VAE is trained on pairs of temporally separated time points, using an analogue of temporal difference learning used in reinforcement learning.
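The abstract describes the core training idea: belief states are formed online from the observation stream, and the model is trained on a pair of time points t1 < t2 by sampling a future latent state from the belief at t2, inferring a compatible earlier state at t1, and scoring both against belief, transition, and reconstruction terms. The sketch below illustrates that two-time-point setup under stated assumptions; it is not the authors' implementation, and all module names, the diagonal-Gaussian parameterizations, and the dimensions (TDVAE, GaussianHead, x_dim, b_dim, z_dim) are illustrative choices made here.

# Hypothetical sketch of training on a pair of temporally separated time
# points (t1 < t2), loosely following the objective described in the paper.
# Architecture sizes, Gaussian heads, and names are assumptions, not the
# reference implementation.
import torch
import torch.nn as nn
import torch.distributions as D

class GaussianHead(nn.Module):
    """Maps features to a diagonal Gaussian over the latent state z."""
    def __init__(self, in_dim, z_dim):
        super().__init__()
        self.net = nn.Linear(in_dim, 2 * z_dim)

    def forward(self, h):
        mu, log_sigma = self.net(h).chunk(2, dim=-1)
        return D.Normal(mu, log_sigma.exp())

class TDVAE(nn.Module):
    def __init__(self, x_dim=32, b_dim=64, z_dim=16):
        super().__init__()
        self.rnn = nn.GRU(x_dim, b_dim, batch_first=True)    # belief state b_t
        self.belief = GaussianHead(b_dim, z_dim)              # p_B(z_t | b_t)
        self.smooth = GaussianHead(z_dim + 2 * b_dim, z_dim)  # q(z_t1 | z_t2, b_t1, b_t2)
        self.transition = GaussianHead(z_dim, z_dim)          # p_T(z_t2 | z_t1)
        self.decoder = nn.Linear(z_dim, x_dim)                 # mean of p(x_t2 | z_t2)

    def loss(self, x, t1, t2):
        b, _ = self.rnn(x)                     # beliefs for every time step
        b1, b2 = b[:, t1], b[:, t2]
        pb2 = self.belief(b2)
        z2 = pb2.rsample()                     # sample a future state from the belief at t2
        qs1 = self.smooth(torch.cat([z2, b1, b2], dim=-1))
        z1 = qs1.rsample()                     # infer an earlier state consistent with z2
        pb1 = self.belief(b1)
        pt2 = self.transition(z1)
        px2 = D.Normal(self.decoder(z2), 1.0)  # unit-variance Gaussian reconstruction
        # Negative ELBO over the (t1, t2) pair: reconstruction, belief at t1,
        # transition to t2, minus the belief at t2 and the smoothing posterior.
        return -(px2.log_prob(x[:, t2]).sum(-1)
                 + pb1.log_prob(z1).sum(-1)
                 + pt2.log_prob(z2).sum(-1)
                 - pb2.log_prob(z2).sum(-1)
                 - qs1.log_prob(z1).sum(-1)).mean()

# Usage on random sequence data with one arbitrarily chosen pair t1 < t2;
# in practice the pair would be resampled for every training step.
model = TDVAE()
x = torch.randn(8, 20, 32)                     # (batch, time, features)
loss = model.loss(x, t1=4, t2=9)
loss.backward()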

Cite

Text

Gregor et al. "Temporal Difference Variational Auto-Encoder." International Conference on Learning Representations, 2019.

Markdown

[Gregor et al. "Temporal Difference Variational Auto-Encoder." International Conference on Learning Representations, 2019.](https://mlanthology.org/iclr/2019/gregor2019iclr-temporal/)

BibTeX

@inproceedings{gregor2019iclr-temporal,
  title     = {{Temporal Difference Variational Auto-Encoder}},
  author    = {Gregor, Karol and Papamakarios, George and Besse, Frederic and Buesing, Lars and Weber, Theophane},
  booktitle = {International Conference on Learning Representations},
  year      = {2019},
  url       = {https://mlanthology.org/iclr/2019/gregor2019iclr-temporal/}
}