Visual Adversarial Imitation Learning Using Variational Models
Abstract
Reward function specification, which requires considerable human effort and iteration, remains a major impediment for learning behaviors through deep reinforcement learning. In contrast, providing visual demonstrations of desired behaviors presents an easier and more natural way to teach agents. We consider a setting where an agent is provided a fixed dataset of visual demonstrations illustrating how to perform a task, and must learn to solve the task using the provided demonstrations and unsupervised environment interactions. This setting presents a number of challenges including representation learning for visual observations, sample complexity due to high dimensional spaces, and learning instability due to the lack of a fixed reward or learning signal. Towards addressing these challenges, we develop a variational model-based adversarial imitation learning (V-MAIL) algorithm. The model-based approach provides a strong signal for representation learning, enables sample efficiency, and improves the stability of adversarial training by enabling on-policy learning. Through experiments involving several vision-based locomotion and manipulation tasks, we find that V-MAIL learns successful visuomotor policies in a sample-efficient manner, has better stability compared to prior work, and also achieves higher asymptotic performance. We further find that by transferring the learned models, V-MAIL can learn new tasks from visual demonstrations without any additional environment interactions. All results including videos can be found online at https://sites.google.com/view/variational-mail
Cite
Text
Rafailov et al. "Visual Adversarial Imitation Learning Using Variational Models." Neural Information Processing Systems, 2021.
Markdown
[Rafailov et al. "Visual Adversarial Imitation Learning Using Variational Models." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/rafailov2021neurips-visual/)
BibTeX
@inproceedings{rafailov2021neurips-visual,
title = {{Visual Adversarial Imitation Learning Using Variational Models}},
author = {Rafailov, Rafael and Yu, Tianhe and Rajeswaran, Aravind and Finn, Chelsea},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/rafailov2021neurips-visual/}
}