Sample-Efficient Adversarial Imitation Learning
Abstract
Imitation learning, wherein learning is performed by demonstration, has been studied and advanced for sequential decision-making tasks in which a reward function is not predefined. However, imitation learning methods still require numerous expert demonstration samples to successfully imitate an expert's behavior. To improve sample efficiency, we utilize self-supervised representation learning, which can generate vast training signals from the given data. In this study, we propose a self-supervised representation-based adversarial imitation learning method to learn state and action representations that are robust to diverse distortions and temporally predictive, on non-image control tasks. In particular, unlike existing self-supervised learning methods for tabular data, we propose a different corruption method so that the learned state and action representations are robust to diverse distortions. The proposed method shows a 39% relative improvement over existing adversarial imitation learning methods on MuJoCo in a setting limited to 100 expert state-action pairs. Moreover, we conduct comprehensive ablations and additional experiments using demonstrations with varying optimality to provide intuition about a range of factors.
Cite
Text
Jung et al. "Sample-Efficient Adversarial Imitation Learning." NeurIPS 2022 Workshops: DeepRL, 2022.
Markdown
[Jung et al. "Sample-Efficient Adversarial Imitation Learning." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/jung2022neuripsw-sampleefficient/)
BibTeX
@inproceedings{jung2022neuripsw-sampleefficient,
title = {{Sample-Efficient Adversarial Imitation Learning}},
author = {Jung, Dahuin and Lee, Hyungyu and Yoon, Sungroh},
booktitle = {NeurIPS 2022 Workshops: DeepRL},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/jung2022neuripsw-sampleefficient/}
}