Structured Apprenticeship Learning

Abstract

We propose a graph-based algorithm for apprenticeship learning when the reward features are noisy. Previous apprenticeship learning techniques learn a reward function using only local state features. This can be a limitation in practice, as some features are often misspecified or subject to measurement noise. Our graphical framework, inspired by work on Markov Random Fields, alleviates this problem by propagating information between states and rewarding policies that choose similar actions in adjacent states. We demonstrate the advantage of the proposed approach on grid-world navigation problems and on the problem of teaching a robot to grasp novel objects in simulation.
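The core idea in the abstract can be illustrated with a minimal sketch: score a policy by a linear reward on (possibly noisy) local state features, plus an MRF-style pairwise term that rewards choosing the same action in adjacent states. This is an illustrative sketch only, not the authors' exact formulation; all names (`policy_score`, `smoothness`, etc.) are hypothetical.

```python
# Illustrative sketch (not the paper's exact algorithm): a policy score that
# combines local feature rewards with a pairwise smoothness bonus, in the
# spirit of Markov Random Field potentials over adjacent states.

def policy_score(policy, features, weights, adjacency, smoothness=0.5):
    """policy: dict state -> action; features: dict state -> feature vector;
    weights: learned reward weights; adjacency: list of (state, state) edges."""
    # Local term: linear reward on each visited state's feature vector.
    local = sum(
        sum(w * f for w, f in zip(weights, features[s]))
        for s in policy
    )
    # Pairwise term: bonus for each pair of adjacent states sharing an action,
    # which propagates information between neighbouring (possibly noisy) states.
    pairwise = sum(1.0 for (s, t) in adjacency if policy[s] == policy[t])
    return local + smoothness * pairwise
```

For example, on a two-state grid with one edge, a policy taking the same action in both states scores higher than one that disagrees across the edge, even with identical local features.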

Cite

Text

Boularias et al. "Structured Apprenticeship Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2012. doi:10.1007/978-3-642-33486-3_15

Markdown

[Boularias et al. "Structured Apprenticeship Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2012.](https://mlanthology.org/ecmlpkdd/2012/boularias2012ecmlpkdd-structured/) doi:10.1007/978-3-642-33486-3_15

BibTeX

@inproceedings{boularias2012ecmlpkdd-structured,
  title     = {{Structured Apprenticeship Learning}},
  author    = {Boularias, Abdeslam and Krömer, Oliver and Peters, Jan},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2012},
  pages     = {227--242},
  doi       = {10.1007/978-3-642-33486-3_15},
  url       = {https://mlanthology.org/ecmlpkdd/2012/boularias2012ecmlpkdd-structured/}
}