Fast Learning with Predictive Forward Models

Abstract

A method for transforming performance evaluation signals distal both in space and time into proximal signals usable by supervised learning algorithms, presented in [Jordan & Jacobs 90], is examined. A simple observation concerning differentiation through models trained with redundant inputs (as one of their networks is) explains a weakness in the original architecture and suggests a modification: an internal world model that encodes action-space exploration and, crucially, cancels input redundancy to the forward model is added. Learning time on an example task, cart-pole balancing, is thereby reduced about 50 to 100 times.
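The observation about differentiating through a model trained with redundant inputs can be illustrated with a toy example. The sketch below is not the paper's architecture: it assumes a hypothetical linear plant `y = x + a` and a two-weight linear forward model, and it breaks the redundancy simply by adding exploration offsets to the action rather than by using an internal world model. All names (`fit_linear`, `true_plant`, `redundant`, `explored`) are invented for illustration. When the action is always a fixed function of the state (`a = 2x`), the model fits the data perfectly yet its derivative with respect to the action is wrong.

```python
def true_plant(x, a):
    """Hypothetical plant: the true partial derivative dy/da is 1."""
    return x + a


def fit_linear(samples, steps=2000, lr=0.5):
    """Fit y ~ w_x * x + w_a * a by batch gradient descent from zero weights."""
    w_x, w_a = 0.0, 0.0
    n = len(samples)
    for _ in range(steps):
        g_x = g_a = 0.0
        for x, a, y in samples:
            err = w_x * x + w_a * a - y
            g_x += err * x
            g_a += err * a
        w_x -= lr * g_x / n
        w_a -= lr * g_a / n
    return w_x, w_a


xs = [i / 10.0 for i in range(-10, 11)]

# Redundant inputs: the action is fully determined by the state (a = 2x).
redundant = [(x, 2 * x, true_plant(x, 2 * x)) for x in xs]

# Exploration (assumed remedy for this sketch): offsets decorrelate a from x.
explored = [(x, 2 * x + e, true_plant(x, 2 * x + e))
            for x in xs for e in (-0.5, 0.5)]

w_red = fit_linear(redundant)
w_exp = fit_linear(explored)

# Both models predict their training data well, but only the second one
# has a d(prediction)/da (its w_a) that matches the plant's true value of 1.
print("redundant inputs: w_x=%.3f, w_a=%.3f" % w_red)   # w_a is about 1.2
print("with exploration: w_x=%.3f, w_a=%.3f" % w_exp)   # w_a is about 1.0
```

In the redundant case any weights satisfying `w_x + 2 * w_a = 3` fit the data equally well, so the derivative passed back through the model to the action is not pinned down by training; gradient descent from zero happens to settle on `w_a = 1.2`. Once the action varies independently of the state, the fit recovers the true derivative.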

Cite

Text

Brody. "Fast Learning with Predictive Forward Models." Neural Information Processing Systems, 1991.

Markdown

[Brody. "Fast Learning with Predictive Forward Models." Neural Information Processing Systems, 1991.](https://mlanthology.org/neurips/1991/brody1991neurips-fast/)

BibTeX

@inproceedings{brody1991neurips-fast,
  title     = {{Fast Learning with Predictive Forward Models}},
  author    = {Brody, Carlos},
  booktitle = {Neural Information Processing Systems},
  year      = {1991},
  pages     = {563-570},
  url       = {https://mlanthology.org/neurips/1991/brody1991neurips-fast/}
}