Reinforcement Learning with Long Short-Term Memory
Abstract
This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task.
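The paper combines an LSTM network with Advantage learning (Harmon and Baird's variant of value-based RL, here with eligibility traces as Advantage(λ)). As a rough illustration of the underlying update rule only, here is a minimal tabular sketch on a toy chain task; the chain environment, the hyperparameter names (`alpha`, `gamma`, `kappa`), and their values are assumptions for illustration, not details from the paper, which uses an LSTM function approximator rather than a table.

```python
import numpy as np

# Tabular Advantage-learning sketch (illustrative only; the paper uses
# an LSTM network, eligibility traces, and directed exploration).
n_states, n_actions = 5, 2
A = np.zeros((n_states, n_actions))   # advantage values A(s, a)
alpha, gamma, kappa = 0.1, 0.9, 0.3   # step size, discount, advantage scale

def step(s, a):
    # Toy deterministic chain: action 1 moves right, action 0 stays.
    # Reward 1.0 for reaching the rightmost (terminal) state.
    s2 = min(s + 1, n_states - 1) if a == 1 else s
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

rng = np.random.default_rng(0)
for episode in range(2000):
    s = 0
    for t in range(20):
        # Epsilon-greedy action selection.
        a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(A[s].argmax())
        s2, r = step(s, a)
        v, v2 = A[s].max(), A[s2].max()
        # Advantage-learning target: state value plus the TD error
        # scaled up by 1/kappa, which widens the gap between actions.
        target = v + (r + gamma * v2 - v) / kappa
        A[s, a] += alpha * (target - A[s, a])
        s = s2
        if s == n_states - 1:
            break

# Greedy policy for the non-terminal states (1 = move right).
print(A.argmax(axis=1)[:4])
```

The 1/κ scaling is the point of Advantage learning over plain Q-learning: it magnifies differences between action values, which matters when a function approximator (here, the LSTM) would otherwise drown those differences in approximation error.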
Cite

Text:
Bakker. "Reinforcement Learning with Long Short-Term Memory." Neural Information Processing Systems, 2001.

Markdown:
[Bakker. "Reinforcement Learning with Long Short-Term Memory." Neural Information Processing Systems, 2001.](https://mlanthology.org/neurips/2001/bakker2001neurips-reinforcement/)

BibTeX:
@inproceedings{bakker2001neurips-reinforcement,
title = {{Reinforcement Learning with Long Short-Term Memory}},
author = {Bakker, Bram},
booktitle = {Neural Information Processing Systems},
year = {2001},
pages = {1475--1482},
url = {https://mlanthology.org/neurips/2001/bakker2001neurips-reinforcement/}
}