Reinforcement Learning in Markovian and Non-Markovian Environments
Abstract
This work addresses three problems with reinforcement learning and adaptive neuro-control: 1. Non-Markovian interfaces between learner and environment. 2. On-line learning based on system realization. 3. Vector-valued adaptive critics. An algorithm is described which is based on system realization and on two interacting fully recurrent continually running networks which may learn in parallel. Problems with parallel learning are attacked by 'adaptive randomness'. It is also described how interacting model/controller systems can be combined with vector-valued 'adaptive critics' (previous critics have been scalar).
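The abstract only names the ingredients; the toy Python sketch below is a loose illustration of two of them, not the paper's actual algorithm: a "model" network is trained on-line to predict the environment's next state and reward (the system-realization role), while a "controller" is improved by random weight perturbations that are kept only when the learned model predicts a higher return, a crude stand-in for the 'adaptive randomness' idea. All names here (env_step, TARGET, predicted_return, etc.) are hypothetical, and simple linear/feedforward maps stand in for the fully recurrent networks of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D environment: the state drifts toward the action,
# and reward is highest when the state sits near a hidden target.
TARGET = 0.7

def env_step(state, action):
    next_state = 0.9 * state + 0.1 * action
    reward = -(next_state - TARGET) ** 2
    return next_state, reward

# Model network: predicts (next_state, reward) from (state, action, bias).
W_model = 0.01 * rng.standard_normal((2, 3))

def model_predict(state, action):
    x = np.array([state, action, 1.0])
    return W_model @ x                      # [predicted next_state, predicted reward]

# Controller: maps state -> bounded action via two weights [gain, bias].
w_ctrl = 0.01 * rng.standard_normal(2)

def controller(w, state):
    return np.tanh(w[0] * state + w[1])

def predicted_return(w, horizon=10):
    """Roll the controller forward inside the learned model and sum predicted reward."""
    s, total = 0.0, 0.0
    for _ in range(horizon):
        a = controller(w, s)
        s, r = model_predict(s, a)
        total += r
    return total

state = 0.0
for step in range(2000):
    # Act in the real environment with a little exploration noise.
    action = controller(w_ctrl, state) + 0.1 * rng.standard_normal()
    next_state, reward = env_step(state, action)

    # Model learning: LMS step on the prediction error (system identification).
    x = np.array([state, action, 1.0])
    err = model_predict(state, action) - np.array([next_state, reward])
    W_model -= 0.05 * np.outer(err, x)

    # Controller learning by accepted random perturbations:
    # keep a trial weight vector only if the model predicts a better return.
    trial = w_ctrl + 0.05 * rng.standard_normal(2)
    if predicted_return(trial) > predicted_return(w_ctrl):
        w_ctrl = trial

    state = next_state

print("learned controller weights:", w_ctrl)
print("final state (target %.2f): %.3f" % (TARGET, state))
```

In this sketch the model and controller are updated in the same loop, echoing the parallel learning of the two networks described in the abstract; the vector-valued critic is not illustrated here.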
Cite
Text
Schmidhuber. "Reinforcement Learning in Markovian and Non-Markovian Environments." Neural Information Processing Systems, 1990.
Markdown
[Schmidhuber. "Reinforcement Learning in Markovian and Non-Markovian Environments." Neural Information Processing Systems, 1990.](https://mlanthology.org/neurips/1990/schmidhuber1990neurips-reinforcement/)
BibTeX
@inproceedings{schmidhuber1990neurips-reinforcement,
title = {{Reinforcement Learning in Markovian and Non-Markovian Environments}},
author = {Schmidhuber, Jürgen},
booktitle = {Neural Information Processing Systems},
year = {1990},
pages = {500-506},
url = {https://mlanthology.org/neurips/1990/schmidhuber1990neurips-reinforcement/}
}