Hierarchical Learning in Stochastic Domains: Preliminary Results
Abstract
This paper presents the HDG learning algorithm, which uses a hierarchical decomposition of the state space to make learning to achieve goals more efficient, at a small penalty in path quality. Special care must be taken when performing hierarchical planning and learning in stochastic domains, because macro-operators cannot be executed ballistically. The HDG algorithm, a descendant of Watkins' Q-learning algorithm, is described here and preliminary empirical results are presented.
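Since HDG descends from Watkins' Q-learning, the base one-step update it builds on can be sketched as follows. This is a minimal tabular sketch of standard Q-learning, not the HDG algorithm itself; the function and variable names are illustrative, not taken from the paper.

```python
def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    Q is a dict mapping (state, action) pairs to value estimates;
    unseen pairs default to 0.0.
    """
    # Value of the best action available in the successor state.
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    # Move the estimate toward the one-step bootstrapped target.
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]
```

HDG's contribution is to apply this style of update within a hierarchical decomposition of the state space, so that learned values for subgoals ("landmarks") can be composed rather than learned flat over the whole state space.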
Cite
Kaelbling. "Hierarchical Learning in Stochastic Domains: Preliminary Results." International Conference on Machine Learning, 1993. doi:10.1016/B978-1-55860-307-3.50028-9
@inproceedings{kaelbling1993icml-hierarchical,
title = {{Hierarchical Learning in Stochastic Domains: Preliminary Results}},
author = {Kaelbling, Leslie Pack},
booktitle = {International Conference on Machine Learning},
year = {1993},
pages = {167--173},
doi = {10.1016/B978-1-55860-307-3.50028-9},
url = {https://mlanthology.org/icml/1993/kaelbling1993icml-hierarchical/}
}