Model-Based Adversarial Imitation Learning as Online Fine-Tuning

Abstract

In many real-world applications of sequential decision-making, such as robotics or autonomous driving, expert-level data is available (or easily obtainable) through methods such as tele-operation. However, directly learning to copy these expert behaviours can result in poor performance due to distribution shift at deployment time. Adversarial imitation learning algorithms alleviate this issue by learning to match the expert state-action distribution through additional environment interactions. Such methods are built on top of standard reinforcement learning algorithms, with both model-based and model-free variants. In this work we focus on the model-based approach and argue that algorithms developed for online RL are sub-optimal for the distribution-matching problem. We theoretically justify using conservative algorithms developed for the offline learning paradigm in online adversarial imitation learning, and empirically demonstrate improved performance and safety on a complex long-range robot manipulation task, directly from images.
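
The abstract's core idea of "matching the expert state-action distribution through additional environment interactions" is the standard adversarial imitation learning setup: a discriminator is trained to separate expert from policy data, and its output is converted into a surrogate reward for the RL step. The sketch below is a generic, minimal illustration of that setup, not the authors' model-based algorithm; all names (`Discriminator`, `discriminator_loss`, `imitation_reward`, `state_dim`, `action_dim`) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Classifies (state, action) pairs as expert (label 1) vs. policy (label 0)."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        # Returns unnormalized logits for the "expert" class.
        return self.net(torch.cat([state, action], dim=-1))


def discriminator_loss(disc, expert_s, expert_a, policy_s, policy_a):
    """Binary cross-entropy objective: push expert pairs toward 1, policy pairs toward 0."""
    expert_logits = disc(expert_s, expert_a)
    policy_logits = disc(policy_s, policy_a)
    return (
        F.binary_cross_entropy_with_logits(expert_logits, torch.ones_like(expert_logits))
        + F.binary_cross_entropy_with_logits(policy_logits, torch.zeros_like(policy_logits))
    )


def imitation_reward(disc, state, action):
    """Surrogate reward: large when the discriminator finds the pair expert-like.
    In an adversarial imitation loop this replaces the environment reward used by
    the downstream (here, model-based) RL algorithm."""
    with torch.no_grad():
        d = torch.sigmoid(disc(state, action))
    return -torch.log(1.0 - d + 1e-8)
```

In the model-based setting the abstract refers to, the policy-side samples would typically come from rollouts in a learned dynamics model rather than only from the real environment; the conservatism argument in the paper concerns how that RL step should be regularized, which this sketch does not attempt to capture.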

Cite

Text

Rafailov et al. "Model-Based Adversarial Imitation Learning as Online Fine-Tuning." ICLR 2023 Workshops: RRL, 2023.

Markdown

[Rafailov et al. "Model-Based Adversarial Imitation Learning as Online Fine-Tuning." ICLR 2023 Workshops: RRL, 2023.](https://mlanthology.org/iclrw/2023/rafailov2023iclrw-modelbased/)

BibTeX

@inproceedings{rafailov2023iclrw-modelbased,
  title     = {{Model-Based Adversarial Imitation Learning as Online Fine-Tuning}},
  author    = {Rafailov, Rafael and Kolev, Victor and Hatch, Kyle Beltran and Martin, John D and Phielipp, Mariano and Wu, Jiajun and Finn, Chelsea},
  booktitle = {ICLR 2023 Workshops: RRL},
  year      = {2023},
  url       = {https://mlanthology.org/iclrw/2023/rafailov2023iclrw-modelbased/}
}