Q-Learning with Hidden-Unit Restarting
Abstract
Platt's resource-allocation network (RAN) (Platt, 1991a, 1991b) is modified for a reinforcement-learning paradigm and to "restart" existing hidden units rather than adding new units. After restarting, units continue to learn via back-propagation. The resulting restart algorithm is tested in a Q-learning network that learns to solve an inverted pendulum problem. Solutions are found faster on average with the restart algorithm than without it.
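The combination described in the abstract can be sketched as a one-hidden-layer Q-network updated by temporal-difference back-propagation, with an occasional "restart" of a low-utility hidden unit when the error is large. This is a minimal illustrative sketch only: the thresholds, the utility measure, and the re-initialization rule here are assumptions, not the RAN-derived criteria used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 4, 8, 2         # e.g. pendulum state -> Q-values for 2 actions
W1 = rng.normal(0, 0.1, (N_HID, N_IN))
W2 = rng.normal(0, 0.1, (N_OUT, N_HID))
ALPHA, GAMMA = 0.1, 0.9
RESTART_ERR = 0.5                     # assumed thresholds, not from the paper
RESTART_UTIL = 1e-3

def forward(x):
    """Hidden activations and Q-values for state x."""
    h = np.tanh(W1 @ x)
    return h, W2 @ h

def q_step(x, a, r, x_next, done):
    """One Q-learning update, followed by an optional hidden-unit restart."""
    global W1, W2
    h, q = forward(x)
    target = r if done else r + GAMMA * forward(x_next)[1].max()
    td = target - q[a]                # temporal-difference error

    # Back-propagate the TD error through the chosen action's Q output.
    dW2 = np.zeros_like(W2)
    dW2[a] = td * h
    dh = td * W2[a] * (1 - h**2)      # tanh derivative
    W1 += ALPHA * np.outer(dh, x)
    W2 += ALPHA * dW2

    # Restart heuristic (assumed): on a large TD error, re-initialize the
    # hidden unit with the smallest outgoing weights so it can relearn
    # near the current input, instead of allocating a new unit.
    if abs(td) > RESTART_ERR:
        util = np.abs(W2).sum(axis=0)
        j = int(util.argmin())
        if util[j] < RESTART_UTIL:
            W1[j] = x / (np.linalg.norm(x) + 1e-8)
            W2[:, j] = 0.0
    return td
```

Repeated calls to `q_step` on transitions from the environment drive the TD error toward zero; the restart rule only fires when a unit is both nearly unused and the network is badly wrong.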
Cite
Text
Anderson. "Q-Learning with Hidden-Unit Restarting." Neural Information Processing Systems, 1992.

Markdown
[Anderson. "Q-Learning with Hidden-Unit Restarting." Neural Information Processing Systems, 1992.](https://mlanthology.org/neurips/1992/anderson1992neurips-qlearning/)

BibTeX
@inproceedings{anderson1992neurips-qlearning,
title = {{Q-Learning with Hidden-Unit Restarting}},
author = {Anderson, Charles W.},
booktitle = {Neural Information Processing Systems},
year = {1992},
  pages = {81--88},
url = {https://mlanthology.org/neurips/1992/anderson1992neurips-qlearning/}
}