Automatic Discovery and Transfer of MAXQ Hierarchies
Abstract
We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables employing safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually-engineered hierarchies and facilitate significant speedup in learning when transferred to a target task.
Cite
Text
Mehta et al. "Automatic Discovery and Transfer of MAXQ Hierarchies." International Conference on Machine Learning, 2008. doi:10.1145/1390156.1390238
Markdown
[Mehta et al. "Automatic Discovery and Transfer of MAXQ Hierarchies." International Conference on Machine Learning, 2008.](https://mlanthology.org/icml/2008/mehta2008icml-automatic/) doi:10.1145/1390156.1390238
BibTeX
@inproceedings{mehta2008icml-automatic,
title = {{Automatic Discovery and Transfer of MAXQ Hierarchies}},
author = {Mehta, Neville and Ray, Soumya and Tadepalli, Prasad and Dietterich, Thomas G.},
booktitle = {International Conference on Machine Learning},
year = {2008},
pages = {648--655},
doi = {10.1145/1390156.1390238},
url = {https://mlanthology.org/icml/2008/mehta2008icml-automatic/}
}