Learning Macro-Actions in Reinforcement Learning

Abstract

We present a method for automatically constructing macro-actions from primitive actions, starting from scratch, during the reinforcement learning process. The overall idea is to reinforce the tendency to perform action b after action a if such a pattern of actions has been rewarded. We test the method on a bicycle task, the car-on-the-hill task, the race-track task and some grid-world tasks. For the bicycle and race-track tasks the use of macro-actions approximately halves the learning time, while for one of the grid-world tasks the learning time is reduced by a factor of 5. The method did not work for the car-on-the-hill task, for reasons we discuss in the conclusion.
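
The core idea stated in the abstract, strengthening the tendency to take action b after action a when that pair has been rewarded, can be illustrated with a small sketch. This is not the paper's algorithm: the class name `ActionPairPreferences`, the `step_size` parameter, the running-average update, and the softmax selection rule are all assumptions chosen only to make the idea concrete.

```python
import math
import random
from collections import defaultdict


class ActionPairPreferences:
    """Hypothetical helper: tracks how strongly action b tends to follow action a."""

    def __init__(self, actions, step_size=0.1):
        self.actions = list(actions)
        self.step_size = step_size  # assumed learning rate, not taken from the paper
        self.pref = defaultdict(float)  # (a, b) -> preference, starting at 0.0

    def update(self, prev_action, action, reward):
        """Move the (prev_action, action) preference toward the observed reward."""
        key = (prev_action, action)
        self.pref[key] += self.step_size * (reward - self.pref[key])

    def probabilities(self, prev_action):
        """Softmax over next-action preferences, conditioned on the previous action."""
        logits = [self.pref[(prev_action, b)] for b in self.actions]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(x - top) for x in logits]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self, prev_action, rng=random):
        """Draw the next action according to the conditional probabilities."""
        return rng.choices(self.actions, weights=self.probabilities(prev_action))[0]


# Usage: repeatedly reward the pattern "left" followed by "right".
prefs = ActionPairPreferences(["left", "right"])
for _ in range(50):
    prefs.update("left", "right", reward=1.0)

probs = prefs.probabilities("left")       # "right" is now favoured after "left"
untouched = prefs.probabilities("right")  # no pair starting with "right" was rewarded
```

After the rewarded updates, `probs` puts more weight on "right" than on "left", while `untouched` stays uniform because nothing was learned about what should follow "right". A full agent would combine such pair preferences with ordinary state-dependent value estimates, and how the paper does that is not specified in the abstract.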

Cite

Text

Randlov. "Learning Macro-Actions in Reinforcement Learning." Neural Information Processing Systems, 1998.

Markdown

[Randlov. "Learning Macro-Actions in Reinforcement Learning." Neural Information Processing Systems, 1998.](https://mlanthology.org/neurips/1998/randlov1998neurips-learning/)

BibTeX

@inproceedings{randlov1998neurips-learning,
  title     = {{Learning Macro-Actions in Reinforcement Learning}},
  author    = {Randlov, Jette},
  booktitle = {Neural Information Processing Systems},
  year      = {1998},
  pages     = {1045--1051},
  url       = {https://mlanthology.org/neurips/1998/randlov1998neurips-learning/}
}