Q-Learning with Hidden-Unit Restarting

Abstract

Platt's resource-allocation network (RAN) (Platt, 1991a, 1991b) is modified for a reinforcement-learning paradigm and to "restart" existing hidden units rather than adding new units. After restarting, units continue to learn via back-propagation. The resulting restart algorithm is tested in a Q-learning network that learns to solve an inverted pendulum problem. Solutions are found faster on average with the restart algorithm than without it.
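The abstract describes the method only at a high level. Below is a minimal sketch of the restart idea in Python, assuming a one-hidden-layer tanh network, a RAN-style novelty test (large TD error on a state far from every unit's input-weight vector), and a least-used-unit selection rule. The thresholds, the unit-selection criterion, and the restart rule itself are illustrative assumptions, not the paper's exact formulation.

import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_actions = 4, 10, 2    # e.g. pendulum state, two force actions
W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.1, (n_actions, n_hidden))
b2 = np.zeros(n_actions)

alpha, gamma = 0.05, 0.95               # step size and discount (assumed values)
err_thresh, dist_thresh = 0.5, 1.0      # restart criteria (assumed values)

def forward(x):
    h = np.tanh(W1 @ x + b1)            # hidden-unit activations
    return h, W2 @ h + b2               # one Q-value per action

def restart_unit(x, a, td_error):
    # Pick the unit contributing least to the output (assumed criterion),
    # re-center its input weights on the novel state, and set its output
    # weight to absorb the residual error, as RAN does when adding a unit.
    j = np.argmin(np.abs(W2).sum(axis=0))
    W1[j] = x
    b1[j] = 0.0
    W2[:, j] = 0.0
    W2[a, j] = td_error

def q_step(x, a, r, x_next, done):
    h, q = forward(x)
    q_next = forward(x_next)[1]
    target = r if done else r + gamma * q_next.max()
    td_error = target - q[a]

    # Restart test: large error on a state far from every unit's weight vector.
    if abs(td_error) > err_thresh and np.linalg.norm(W1 - x, axis=1).min() > dist_thresh:
        restart_unit(x, a, td_error)
        h, q = forward(x)               # recompute with the restarted unit
        td_error = target - q[a]

    # All units, restarted or not, continue to learn by back-propagation.
    dq = np.zeros(n_actions)
    dq[a] = td_error
    W2 += alpha * np.outer(dq, h)
    b2 += alpha * dq
    dh = (W2.T @ dq) * (1.0 - h ** 2)
    W1 += alpha * np.outer(dh, x)
    b1 += alpha * dh
    return td_error

On each interaction with the pendulum simulator one would call q_step(x, a, r, x_next, done), with the action a chosen, for example, epsilon-greedily over forward(x)[1].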

Cite

Text

Anderson. "Q-Learning with Hidden-Unit Restarting." Neural Information Processing Systems, 1992.

Markdown

[Anderson. "Q-Learning with Hidden-Unit Restarting." Neural Information Processing Systems, 1992.](https://mlanthology.org/neurips/1992/anderson1992neurips-qlearning/)

BibTeX

@inproceedings{anderson1992neurips-qlearning,
  title     = {{Q-Learning with Hidden-Unit Restarting}},
  author    = {Anderson, Charles W.},
  booktitle = {Neural Information Processing Systems},
  year      = {1992},
  pages     = {81--88},
  url       = {https://mlanthology.org/neurips/1992/anderson1992neurips-qlearning/}
}