Imitation with Neural Density Models
Abstract
We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure followed by Maximum Occupancy Entropy Reinforcement Learning (RL) using the density as a reward. Our approach maximizes a non-adversarial model-free RL objective that provably lower bounds reverse Kullback–Leibler divergence between occupancy measures of the expert and imitator. We present a practical IL algorithm, Neural Density Imitation (NDI), which obtains state-of-the-art demonstration efficiency on benchmark control tasks.
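The link between a density-based reward, occupancy entropy, and reverse KL can be sketched with a standard identity. The notation below is an assumption for illustration and is not taken from the paper: $\rho_\pi$ and $\rho_E$ denote the (normalized) occupancy measures of the imitator and the expert, and $\mathcal{H}$ denotes entropy.

```latex
\begin{aligned}
D_{\mathrm{KL}}(\rho_\pi \,\|\, \rho_E)
  &= \mathbb{E}_{(s,a)\sim\rho_\pi}\big[\log \rho_\pi(s,a) - \log \rho_E(s,a)\big] \\
  &= -\mathcal{H}(\rho_\pi) - \mathbb{E}_{(s,a)\sim\rho_\pi}\big[\log \rho_E(s,a)\big],
\end{aligned}
\qquad\text{so}\qquad
-D_{\mathrm{KL}}(\rho_\pi \,\|\, \rho_E)
  = \underbrace{\mathbb{E}_{(s,a)\sim\rho_\pi}\big[\log \rho_E(s,a)\big]}_{\text{RL with reward } \log \rho_E}
  + \underbrace{\mathcal{H}(\rho_\pi)}_{\text{occupancy entropy}}.
```

Under these assumptions, minimizing the reverse KL is the same as maximizing the expected log-density of the expert under the imitator's occupancy plus the imitator's occupancy entropy. The abstract states that the practical objective is a lower bound; the exact form of that bound, and how the expert density is estimated, are not given here.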
Cite
Text
Kim et al. "Imitation with Neural Density Models." Neural Information Processing Systems, 2021.
Markdown
[Kim et al. "Imitation with Neural Density Models." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/kim2021neurips-imitation/)
BibTeX
@inproceedings{kim2021neurips-imitation,
title = {{Imitation with Neural Density Models}},
author = {Kim, Kuno and Jindal, Akshat and Song, Yang and Song, Jiaming and Sui, Yanan and Ermon, Stefano},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/kim2021neurips-imitation/}
}