Automatic Discovery and Transfer of MAXQ Hierarchies

Abstract

We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables employing safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually engineered hierarchies and facilitate significant speedup in learning when transferred to a target task.
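
To make the idea of "causal and temporal relationships among the actions" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each action's model can be summarized as the set of variables it reads and the set it writes (a simplification of what a dynamic Bayesian network model provides), and it links each action to the most recent earlier action that wrote a variable it reads. The function name `causal_edges`, the toy resource-gathering domain, and all variable names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical action models: variables each action reads and writes.
# This toy domain is invented for illustration only.
MODELS: Dict[str, Dict[str, Set[str]]] = {
    "goto_forest": {"reads": {"loc"}, "writes": {"loc"}},
    "chop": {"reads": {"loc"}, "writes": {"has_wood"}},
    "goto_base": {"reads": {"loc"}, "writes": {"loc"}},
    "deposit": {"reads": {"loc", "has_wood"}, "writes": {"has_wood", "stored"}},
}


def causal_edges(
    trajectory: List[str], models: Dict[str, Dict[str, Set[str]]]
) -> List[Tuple[int, int, str]]:
    """Return (i, j, var) triples: step i wrote var, step j > i reads var,
    and no step strictly between i and j wrote var."""
    edges: List[Tuple[int, int, str]] = []
    last_writer: Dict[str, int] = {}  # variable -> index of its latest writer
    for j, action in enumerate(trajectory):
        # Link reads to the latest writer before recording this step's writes,
        # so an action never links to itself.
        for var in sorted(models[action]["reads"]):
            if var in last_writer:
                edges.append((last_writer[var], j, var))
        for var in models[action]["writes"]:
            last_writer[var] = j
    return edges


trajectory = ["goto_forest", "chop", "goto_base", "deposit"]
edges = causal_edges(trajectory, MODELS)
for i, j, var in edges:
    print(f"{trajectory[i]} (step {i}) -> {trajectory[j]} (step {j}) via {var}")
```

In this sketch, groups of actions connected through the same variables are the kind of structure from which candidate subtasks could be read off; how HI-MAT actually partitions the trajectory and derives state abstractions is described in the paper itself.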

Cite

Text

Mehta et al. "Automatic Discovery and Transfer of MAXQ Hierarchies." International Conference on Machine Learning, 2008. doi:10.1145/1390156.1390238

Markdown

[Mehta et al. "Automatic Discovery and Transfer of MAXQ Hierarchies." International Conference on Machine Learning, 2008.](https://mlanthology.org/icml/2008/mehta2008icml-automatic/) doi:10.1145/1390156.1390238

BibTeX

@inproceedings{mehta2008icml-automatic,
  title     = {{Automatic Discovery and Transfer of MAXQ Hierarchies}},
  author    = {Mehta, Neville and Ray, Soumya and Tadepalli, Prasad and Dietterich, Thomas G.},
  booktitle = {International Conference on Machine Learning},
  year      = {2008},
  pages     = {648--655},
  doi       = {10.1145/1390156.1390238},
  url       = {https://mlanthology.org/icml/2008/mehta2008icml-automatic/}
}