Model Predictive Adversarial Imitation Learning for Planning from Observation
Abstract
Humans can often perform a new task after observing a few demonstrations by inferring the underlying intent. For robots, recovering the intent of the demonstrator through a learned reward function can enable more efficient, interpretable, and robust imitation through planning. A common paradigm for learning how to plan from demonstration involves first solving for a reward via Inverse Reinforcement Learning (IRL) and then deploying it via Model Predictive Control (MPC). In this work, we unify these two procedures by introducing planning-based Adversarial Imitation Learning (AIL), which simultaneously learns a reward and improves a planning-based agent through experience while using observation-only demonstrations. We study the advantages of planning-based AIL in generalization, interpretability, robustness, and sample efficiency through experiments in simulated control tasks and real-world navigation from a few or even a single observation-only demonstration.
Cite
Text
Han et al. "Model Predictive Adversarial Imitation Learning for Planning from Observation." International Conference on Learning Representations, 2026.
Markdown
[Han et al. "Model Predictive Adversarial Imitation Learning for Planning from Observation." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/han2026iclr-model/)
BibTeX
@inproceedings{han2026iclr-model,
title = {{Model Predictive Adversarial Imitation Learning for Planning from Observation}},
author = {Han, Tyler and Bao, Yanda and Mehta, Bhaumik and Guo, Gabriel and Jung, Sanghun and Vishwakarma, Anubhav and Kang, Emily and Scalise, Rosario and Zhou, Jason Liren and Xu, Bryan and Boots, Byron},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/han2026iclr-model/}
}