Task-Relevant Adversarial Imitation Learning

Abstract

We show that a critical vulnerability in adversarial imitation is the tendency of discriminator networks to learn spurious associations between visual features and expert labels. When the discriminator focuses on task-irrelevant features, it does not provide an informative reward signal, leading to poor task performance. We analyze this problem in detail and propose a solution that outperforms standard Generative Adversarial Imitation Learning (GAIL). Our proposed method, Task-Relevant Adversarial Imitation Learning (TRAIL), uses constrained discriminator optimization to learn informative rewards. In comprehensive experiments, we show that TRAIL can solve challenging robotic manipulation tasks from pixels by imitating human operators without access to any task rewards, and clearly outperforms comparable baseline imitation agents, including those trained via behaviour cloning and conventional GAIL.

Cite

Text

Zolna et al. "Task-Relevant Adversarial Imitation Learning." Conference on Robot Learning, 2020.

Markdown

[Zolna et al. "Task-Relevant Adversarial Imitation Learning." Conference on Robot Learning, 2020.](https://mlanthology.org/corl/2020/zolna2020corl-taskrelevant/)

BibTeX

@inproceedings{zolna2020corl-taskrelevant,
  title     = {{Task-Relevant Adversarial Imitation Learning}},
  author    = {Zolna, Konrad and Reed, Scott and Novikov, Alexander and Colmenarejo, Sergio Gómez and Budden, David and Cabi, Serkan and Denil, Misha and de Freitas, Nando and Wang, Ziyu},
  booktitle = {Conference on Robot Learning},
  year      = {2020},
  pages     = {247--263},
  volume    = {155},
  url       = {https://mlanthology.org/corl/2020/zolna2020corl-taskrelevant/}
}