Reinforcement Learning in Markovian and Non-Markovian Environments

Abstract

This work addresses three problems with reinforcement learning and adaptive neuro-control: 1. Non-Markovian interfaces between learner and environment. 2. On-line learning based on system realization. 3. Vector-valued adaptive critics. An algorithm is described which is based on system realization and on two interacting fully recurrent continually running networks which may learn in parallel. Problems with parallel learning are attacked by 'adaptive randomness'. It is also described how interacting model/controller systems can be combined with vector-valued 'adaptive critics' (previous critics have been scalar).
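The abstract only outlines the algorithm, but the model/controller idea it mentions can be illustrated with a minimal, hypothetical Python sketch. The sketch below collapses the paper's two fully recurrent, continually running networks into one-step feedforward networks, uses a toy reward function, and replaces the paper's 'adaptive randomness' and vector-valued critic with plain Gaussian exploration noise and a scalar predicted reward; all names, dimensions, and the environment are illustrative assumptions, not the paper's implementation.

import numpy as np

rng = np.random.default_rng(0)

def mlp_init(n_in, n_hidden, n_out, scale=0.1):
    # Tiny one-hidden-layer network: params = [W1, b1, W2, b2].
    return [scale * rng.standard_normal((n_hidden, n_in)), np.zeros(n_hidden),
            scale * rng.standard_normal((n_out, n_hidden)), np.zeros(n_out)]

def mlp_forward(params, x):
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

def env_reward(obs, action):
    # Toy environment: reward is highest when the action matches the observation.
    return -np.sum((obs - action) ** 2)

n_obs, n_act, n_hid = 2, 2, 16
model = mlp_init(n_obs + n_act, n_hid, 1)    # predicts reward from (obs, action)
controller = mlp_init(n_obs, n_hid, n_act)   # maps observation to action
lr_model, lr_ctrl, sigma = 0.02, 0.05, 0.3

for step in range(5000):
    obs = rng.uniform(-1, 1, n_obs)

    # Controller proposes an action; the exploration noise is a crude stand-in
    # for the paper's 'adaptive randomness'.
    a_mean, h_c = mlp_forward(controller, obs)
    action = a_mean + sigma * rng.standard_normal(n_act)

    # 1) System identification: fit the model to the observed reward.
    r = env_reward(obs, action)
    x_m = np.concatenate([obs, action])
    r_hat, h_m = mlp_forward(model, x_m)
    W1m, b1m, W2m, b2m = model
    d_out = np.array([r_hat[0] - r])             # d(squared error)/d(output)
    d_hid = (W2m.T @ d_out) * (1 - h_m ** 2)
    model[2] -= lr_model * np.outer(d_out, h_m)
    model[3] -= lr_model * d_out
    model[0] -= lr_model * np.outer(d_hid, x_m)
    model[1] -= lr_model * d_hid

    # 2) Train the controller *through* the model: differentiate the model's
    #    predicted reward w.r.t. the action, then push that gradient back
    #    through the controller (gradient ascent on predicted reward).
    _, h_m2 = mlp_forward(model, np.concatenate([obs, a_mean]))
    d_hid_m = (W2m.T @ np.ones(1)) * (1 - h_m2 ** 2)
    d_action = W1m[:, n_obs:].T @ d_hid_m        # d(predicted reward)/d(action)
    W1c, b1c, W2c, b2c = controller
    d_hid_c = (W2c.T @ d_action) * (1 - h_c ** 2)
    controller[2] += lr_ctrl * np.outer(d_action, h_c)
    controller[3] += lr_ctrl * d_action
    controller[0] += lr_ctrl * np.outer(d_hid_c, obs)
    controller[1] += lr_ctrl * d_hid_c

obs = np.array([0.5, -0.3])
print("controller output:", mlp_forward(controller, obs)[0])  # ideally close to obs

The design point carried over from the abstract is the division of labour: the model network is trained by system identification on observed outcomes, and the controller is improved by differentiating through that learned model rather than through the environment itself.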

Cite

Text

Schmidhuber. "Reinforcement Learning in Markovian and Non-Markovian Environments." Neural Information Processing Systems, 1990.

Markdown

[Schmidhuber. "Reinforcement Learning in Markovian and Non-Markovian Environments." Neural Information Processing Systems, 1990.](https://mlanthology.org/neurips/1990/schmidhuber1990neurips-reinforcement/)

BibTeX

@inproceedings{schmidhuber1990neurips-reinforcement,
  title     = {{Reinforcement Learning in Markovian and Non-Markovian Environments}},
  author    = {Schmidhuber, Jürgen},
  booktitle = {Neural Information Processing Systems},
  year      = {1990},
  pages     = {500-506},
  url       = {https://mlanthology.org/neurips/1990/schmidhuber1990neurips-reinforcement/}
}