Active Imitation Learning of Hierarchical Policies
Abstract
In this paper, we study the problem of imitation learning of hierarchical policies from demonstrations. The main difficulty in learning hierarchical policies by imitation is that the high level intention structure of the policy, which is often critical for understanding the demonstration, is unobserved. We formulate this problem as active learning of Probabilistic State-Dependent Grammars (PSDGs) from demonstrations. Given a set of expert demonstrations, our approach learns a hierarchical policy by actively selecting demonstrations and using queries to explicate their intentional structure at selected points. Our contributions include a new algorithm for imitation learning of hierarchical policies and principled heuristics for the selection of demonstrations and queries. Experimental results in five different domains exhibit successful learning using fewer queries than a variety of alternatives.
Cite
Text
Hamidi et al. "Active Imitation Learning of Hierarchical Policies." International Joint Conference on Artificial Intelligence, 2015.
Markdown
[Hamidi et al. "Active Imitation Learning of Hierarchical Policies." International Joint Conference on Artificial Intelligence, 2015.](https://mlanthology.org/ijcai/2015/hamidi2015ijcai-active/)
BibTeX
@inproceedings{hamidi2015ijcai-active,
title = {{Active Imitation Learning of Hierarchical Policies}},
author = {Hamidi, Mandana and Tadepalli, Prasad and Goetschalckx, Robby and Fern, Alan},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2015},
pages = {3554--3560},
url = {https://mlanthology.org/ijcai/2015/hamidi2015ijcai-active/}
}