Dynamics Adapted Imitation Learning

Abstract

We consider Imitation Learning under dynamics variation between the expert demonstrations (source domain) and the environment (target domain). Building on the popular framework of Adversarial Imitation Learning, we propose a novel algorithm, Dynamics Adapted Imitation Learning (DYNAIL), which incorporates the dynamics variation into state-action occupancy measure matching as a regularization term. The dynamics variation is modeled by a pair of classifiers that distinguish between source and target dynamics. Theoretically, we provide an upper bound on the divergence between the learned policy and the expert demonstrations in the source domain. Our error bound depends only on the expected discrepancy between the source and target dynamics under the optimal policy in the target domain. Experimental evaluation shows that our method achieves superior results on high-dimensional continuous control tasks compared to existing imitation learning methods.
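
The abstract does not say how a pair of classifiers yields a dynamics discrepancy. One common construction, assumed here and not necessarily the exact formulation used by DYNAIL, trains one domain classifier on (s, a, s') and another on (s, a). By Bayes' rule, the difference of their log-odds equals log p_target(s'|s, a) - log p_source(s'|s, a). The sketch below checks this identity on a toy discrete example with Bayes-optimal classifiers; the function name and all numbers are hypothetical.

```python
import math


def dynamics_log_ratio(logit_sas, logit_sa):
    """Estimate log p_target(s'|s,a) - log p_source(s'|s,a).

    logit_sas: log-odds (target vs. source) from a classifier on (s, a, s').
    logit_sa:  log-odds (target vs. source) from a classifier on (s, a).
    Assumption: both classifiers are close to Bayes-optimal.
    """
    return logit_sas - logit_sa


# Toy setup (all values hypothetical): one fixed (s, a), three next states.
p_src_next = [0.7, 0.2, 0.1]   # source dynamics p_source(s'|s,a)
p_tgt_next = [0.3, 0.3, 0.4]   # target dynamics p_target(s'|s,a)
p_src_sa, p_tgt_sa = 0.5, 0.25  # how often (s, a) occurs in each domain
prior_tgt = 0.4                 # fraction of samples labelled "target"

# Log-odds a Bayes-optimal classifier on (s, a) would output.
logit_sa = math.log(p_tgt_sa * prior_tgt) - math.log(p_src_sa * (1 - prior_tgt))

estimated, exact = [], []
for p_s, p_t in zip(p_src_next, p_tgt_next):
    # Log-odds a Bayes-optimal classifier on (s, a, s') would output.
    logit_sas = (math.log(p_tgt_sa * p_t * prior_tgt)
                 - math.log(p_src_sa * p_s * (1 - prior_tgt)))
    estimated.append(dynamics_log_ratio(logit_sas, logit_sa))
    exact.append(math.log(p_t) - math.log(p_s))

# The visitation and prior terms cancel, leaving only the dynamics ratio.
max_error = max(abs(e - x) for e, x in zip(estimated, exact))
```

In practice the two logits would come from learned classifiers, not closed-form probabilities, so the estimate is only as good as the classifiers. How such a quantity enters the occupancy-matching objective as a regularizer is specified in the paper, not here.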

Cite

Text

Liu et al. "Dynamics Adapted Imitation Learning." Transactions on Machine Learning Research, 2023.

Markdown

[Liu et al. "Dynamics Adapted Imitation Learning." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/liu2023tmlr-dynamics/)

BibTeX

@article{liu2023tmlr-dynamics,
  title     = {{Dynamics Adapted Imitation Learning}},
  author    = {Liu, Zixuan and Liu, Liu and Wu, Bingzhe and Li, Lanqing and Wang, Xueqian and Yuan, Bo and Zhao, Peilin},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/liu2023tmlr-dynamics/}
}