Imitation Learning by Reinforcement Learning

Abstract

Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of the expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. We conduct experiments that confirm our reduction works well in practice for continuous control tasks.
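To make the reduction in the abstract concrete, here is a minimal sketch of one possible stationary reward. The abstract does not spell out the reward, so this is an assumption: an indicator that pays 1 when the agent's (state, action) pair appears in the expert demonstrations and 0 otherwise. The names `build_expert_set` and `imitation_reward` are hypothetical, and the discrete toy states stand in for whatever representation a real task would use; continuous control tasks would need a distance-based relaxation rather than exact matching.

```python
from typing import Hashable, Iterable, Set, Tuple

# A (state, action) pair; both must be hashable in this toy setting.
StateAction = Tuple[Hashable, Hashable]


def build_expert_set(
    demonstrations: Iterable[Iterable[StateAction]],
) -> Set[StateAction]:
    """Collect every (state, action) pair seen in the expert trajectories."""
    return {pair for trajectory in demonstrations for pair in trajectory}


def imitation_reward(
    state: Hashable, action: Hashable, expert_pairs: Set[StateAction]
) -> float:
    """Assumed stationary reward: 1.0 on expert pairs, 0.0 elsewhere.

    The reward depends only on (state, action), not on the training step,
    so any off-the-shelf RL algorithm could in principle optimize it.
    """
    return 1.0 if (state, action) in expert_pairs else 0.0


# Toy demonstration from a deterministic expert (hypothetical data).
demos = [[(0, "right"), (1, "right"), (2, "stop")]]
expert_pairs = build_expert_set(demos)

on_expert = imitation_reward(1, "right", expert_pairs)
off_expert = imitation_reward(1, "left", expert_pairs)
print(on_expert, off_expert)
```

Under this assumption, the imitation learner is simply an RL agent trained on `imitation_reward` in place of the environment's own reward; the environment reward is never queried.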

Cite

Text

Ciosek. "Imitation Learning by Reinforcement Learning." International Conference on Learning Representations, 2022.

Markdown

[Ciosek. "Imitation Learning by Reinforcement Learning." International Conference on Learning Representations, 2022.](https://mlanthology.org/iclr/2022/ciosek2022iclr-imitation/)

BibTeX

@inproceedings{ciosek2022iclr-imitation,
  title     = {{Imitation Learning by Reinforcement Learning}},
  author    = {Ciosek, Kamil},
  booktitle = {International Conference on Learning Representations},
  year      = {2022},
  url       = {https://mlanthology.org/iclr/2022/ciosek2022iclr-imitation/}
}