Robust Inverse Reinforcement Learning Control with Unknown States
Abstract
This paper designs a robust inverse reinforcement learning (IRL) algorithm that observes the expert’s inputs and outputs to reconstruct the underlying cost function weights and optimal control policy for optimal discrete-time (DT) output feedback (OPFB) control systems in the presence of disturbances and unmeasurable states. The expert system is captured by a zero-sum game in which its OPFB controller minimizes a cost function while robustly mitigating the effect of the worst-case disturbance, achieving a prescribed attenuation level. The inputs and outputs of the expert can be observed, but not the states. To enable the learner to replicate the behavior of the expert, we first develop a model-based IRL algorithm and subsequently design an equivalent model-free, data-driven version. The latter infers the quadratic cost function weights that yield the expert’s static OPFB control policy, using output and input data of both the expert and the learner. The convergence of the proposed algorithms is rigorously validated through theoretical analysis and numerical experiments.
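The zero-sum game setting described in the abstract is commonly posed in the following standard form (a sketch for illustration, not taken verbatim from the paper; the weight matrices $Q$, $R$, the gain $K$, and the attenuation level $\gamma$ are assumed names):

```latex
% Hedged sketch of a standard discrete-time zero-sum game cost for
% output feedback: Q \succeq 0 and R \succ 0 are the quadratic weights
% the IRL algorithm would infer, and \gamma is the prescribed
% disturbance-attenuation level.
J(u, w) = \sum_{k=0}^{\infty}
  \left( y_k^{\top} Q\, y_k + u_k^{\top} R\, u_k
         - \gamma^{2} w_k^{\top} w_k \right),
\qquad
u_k = -K y_k \quad \text{(static OPFB policy)}.
```

In this formulation the expert's controller solves $\min_{u} \max_{w} J(u, w)$, so the control input minimizes the cost while the worst-case disturbance $w$ maximizes it, which matches the attenuation behavior described above.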
Cite
Text
Lian et al. "Robust Inverse Reinforcement Learning Control with Unknown States." Proceedings of the 7th Annual Learning for Dynamics & Control Conference, 2025.
Markdown
[Lian et al. "Robust Inverse Reinforcement Learning Control with Unknown States." Proceedings of the 7th Annual Learning for Dynamics & Control Conference, 2025.](https://mlanthology.org/l4dc/2025/lian2025l4dc-robust/)
BibTeX
@inproceedings{lian2025l4dc-robust,
title = {{Robust Inverse Reinforcement Learning Control with Unknown States}},
author = {Lian, Bosen and Xue, Wenqian and Nguyen, Nhan},
booktitle = {Proceedings of the 7th Annual Learning for Dynamics \& Control Conference},
year = {2025},
pages = {750--762},
volume = {283},
url = {https://mlanthology.org/l4dc/2025/lian2025l4dc-robust/}
}