Imitation with Neural Density Models

Abstract

We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure, followed by Maximum Occupancy Entropy Reinforcement Learning (RL) using the estimated density as a reward. Our approach maximizes a non-adversarial, model-free RL objective that provably lower bounds the reverse Kullback–Leibler divergence between the occupancy measures of the expert and the imitator. We present a practical IL algorithm, Neural Density Imitation (NDI), which obtains state-of-the-art demonstration efficiency on benchmark control tasks.
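The link between "density as a reward", "occupancy entropy", and reverse KL can be illustrated with a standard identity: for occupancy measures ρ_π (imitator) and ρ_E (expert), E_{ρ_π}[log ρ_E] + H(ρ_π) = −D_KL(ρ_π ‖ ρ_E). The sketch below checks this numerically on a small discrete example. It is not the NDI algorithm: the distributions, function names, and the four-element support are hypothetical, and the exact entropy is computed directly here, whereas a practical method would have to estimate or bound these quantities from samples.

```python
import math

def reverse_kl(rho_pi, rho_e):
    """D_KL(rho_pi || rho_e) for discrete distributions given as lists."""
    return sum(p * math.log(p / q) for p, q in zip(rho_pi, rho_e) if p > 0)

def expected_log_density_reward(rho_pi, rho_e):
    """E_{rho_pi}[log rho_e]: the average reward when r = log expert density."""
    return sum(p * math.log(q) for p, q in zip(rho_pi, rho_e) if p > 0)

def entropy(rho):
    """Shannon entropy H(rho) in nats."""
    return -sum(p * math.log(p) for p in rho if p > 0)

# Hypothetical occupancy measures over four (state, action) pairs.
rho_expert = [0.5, 0.3, 0.15, 0.05]
rho_imitator = [0.4, 0.4, 0.1, 0.1]

# Expected log-density reward plus occupancy entropy ...
objective = expected_log_density_reward(rho_imitator, rho_expert) + entropy(rho_imitator)
# ... equals the negative reverse KL divergence.
neg_kl = -reverse_kl(rho_imitator, rho_expert)

print(f"objective = {objective:.6f}, -KL = {neg_kl:.6f}")
```

Because the two quantities coincide, replacing the entropy term with any lower bound on it gives an objective that lower bounds the negative reverse KL, so maximizing that objective pushes the imitator's occupancy measure toward the expert's.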

Cite

Text

Kim et al. "Imitation with Neural Density Models." Neural Information Processing Systems, 2021.

Markdown

[Kim et al. "Imitation with Neural Density Models." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/kim2021neurips-imitation/)

BibTeX

@inproceedings{kim2021neurips-imitation,
  title     = {{Imitation with Neural Density Models}},
  author    = {Kim, Kuno and Jindal, Akshat and Song, Yang and Song, Jiaming and Sui, Yanan and Ermon, Stefano},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/kim2021neurips-imitation/}
}