Improving Generalization for Temporal Difference Learning: The Successor Representation

Abstract

Estimation of returns over time, the focus of temporal difference (TD) algorithms, imposes particular constraints on good function approximators or representations. Appropriate generalization between states is determined by how similar their successors are, and representations should follow suit. This paper shows how TD machinery can be used to learn such representations, and illustrates, using a navigation task, the appropriately distributed nature of the result.
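The abstract's core idea, learning a representation whose rows give expected discounted future state occupancies via TD updates, can be illustrated with a minimal sketch. The 5-state ring, discount factor, and step size below are illustrative assumptions, not taken from the paper (which uses a navigation grid task):

```python
import random

# Hedged sketch: TD(0) learning of a successor representation M, where
# M[s][j] estimates the expected discounted future occupancy of state j
# when starting from state s. The 5-state ring with a uniform random walk
# is an assumed toy environment, not the paper's navigation task.
n_states, gamma, alpha = 5, 0.9, 0.1
M = [[0.0] * n_states for _ in range(n_states)]

random.seed(0)
s = 0
for _ in range(20000):
    s_next = (s + random.choice([-1, 1])) % n_states  # random walk step
    for j in range(n_states):
        # TD update: target is immediate occupancy of j plus the
        # discounted successor estimate from the next state.
        target = (1.0 if j == s else 0.0) + gamma * M[s_next][j]
        M[s][j] += alpha * (target - M[s][j])
    s = s_next

# Each row of M now approximates discounted future occupancies: states
# whose successors overlap get similar rows, which is exactly the
# generalization structure the abstract argues TD methods should exploit.
```

Under this dynamics, nearby states on the ring receive larger entries than distant ones, and each row sum converges toward 1/(1 - gamma).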

Cite

Text

Dayan. "Improving Generalization for Temporal Difference Learning: The Successor Representation." Neural Computation, 1993. doi:10.1162/NECO.1993.5.4.613

Markdown

[Dayan. "Improving Generalization for Temporal Difference Learning: The Successor Representation." Neural Computation, 1993.](https://mlanthology.org/neco/1993/dayan1993neco-improving/) doi:10.1162/NECO.1993.5.4.613

BibTeX

@article{dayan1993neco-improving,
  title     = {{Improving Generalization for Temporal Difference Learning: The Successor Representation}},
  author    = {Dayan, Peter},
  journal   = {Neural Computation},
  year      = {1993},
  pages     = {613--624},
  doi       = {10.1162/NECO.1993.5.4.613},
  volume    = {5},
  url       = {https://mlanthology.org/neco/1993/dayan1993neco-improving/}
}