Feature Construction for Inverse Reinforcement Learning

Abstract

The goal of inverse reinforcement learning is to find a reward function for a Markov decision process, given example traces from its optimal policy. Current IRL techniques generally rely on user-supplied features that form a concise basis for the reward. We present an algorithm that instead constructs reward features from a large collection of component features, by building logical conjunctions of those component features that are relevant to the example policy. Given example traces, the algorithm returns a reward function as well as the constructed features. The reward function can be used to recover a full, deterministic, stationary policy, and the features can be used to transplant the reward function into any novel environment on which the component features are well defined.
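To make the idea in the abstract concrete, here is a minimal, hypothetical sketch (not the paper's actual algorithm): it enumerates low-order logical conjunctions of binary component features, keeps the conjunctions most correlated with the expert's state visitation, and fits linear reward weights over the constructed features. All names and the toy data are illustrative assumptions.

```python
# Hypothetical sketch of conjunctive feature construction for IRL.
# Not the authors' method; just illustrates building conjunctions of
# component features and fitting a linear reward over them.
import itertools
import numpy as np

def build_conjunctions(components, max_order=2):
    """Enumerate conjunctions (index tuples) of up to `max_order` component features."""
    n = components.shape[1]
    conjunctions = []
    for order in range(1, max_order + 1):
        conjunctions.extend(itertools.combinations(range(n), order))
    return conjunctions

def evaluate_conjunctions(components, conjunctions):
    """Binary |S| x |conjunctions| matrix: does state s satisfy conjunction c?"""
    return np.stack(
        [components[:, list(c)].all(axis=1) for c in conjunctions], axis=1
    ).astype(float)

def select_relevant(features, visited, k=3):
    """Keep the k conjunctions most correlated with expert state visitation."""
    scores = np.nan_to_num(np.abs(np.corrcoef(features.T, visited)[-1, :-1]))
    return np.argsort(-scores)[:k]

def fit_reward(features, visited):
    """Least-squares weights so frequently visited states receive high reward."""
    w, *_ = np.linalg.lstsq(features, visited, rcond=None)
    return features @ w, w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy problem: 50 states, each with 6 binary component features.
    components = rng.integers(0, 2, size=(50, 6))
    # Hypothetical expert visits states where component features 0 AND 2 both hold.
    visited = (components[:, 0] & components[:, 2]).astype(float)

    conjs = build_conjunctions(components, max_order=2)
    phi = evaluate_conjunctions(components, conjs)
    keep = select_relevant(phi, visited, k=3)
    reward, weights = fit_reward(phi[:, keep], visited)

    for idx, w in zip(keep, weights):
        print("conjunction", conjs[idx], "weight", round(float(w), 3))
```

Because the learned reward is expressed over the constructed conjunctions rather than over states, it can be evaluated in any new environment where the same component features are defined, which is the transplantation property the abstract highlights.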

Cite

Text

Levine et al. "Feature Construction for Inverse Reinforcement Learning." Neural Information Processing Systems, 2010.

Markdown

[Levine et al. "Feature Construction for Inverse Reinforcement Learning." Neural Information Processing Systems, 2010.](https://mlanthology.org/neurips/2010/levine2010neurips-feature/)

BibTeX

@inproceedings{levine2010neurips-feature,
  title     = {{Feature Construction for Inverse Reinforcement Learning}},
  author    = {Levine, Sergey and Popovic, Zoran and Koltun, Vladlen},
  booktitle = {Neural Information Processing Systems},
  year      = {2010},
  pages     = {1342--1350},
  url       = {https://mlanthology.org/neurips/2010/levine2010neurips-feature/}
}