Meta Inverse Constrained Reinforcement Learning: Convergence Guarantee and Generalization Analysis

Abstract

This paper considers the problem of learning the reward function and constraints of an expert from a few demonstrations. This problem can be cast as a meta-learning problem: we first learn meta-priors over reward functions and constraints from other distinct but related tasks, and then adapt the learned meta-priors to new tasks from only a few expert demonstrations. We formulate a bi-level optimization problem where the upper level aims to learn a meta-prior over reward functions and the lower level learns a meta-prior over constraints. We propose a novel algorithm to solve this problem and formally guarantee that it reaches the set of $\epsilon$-stationary points with iteration complexity $O(\frac{1}{\epsilon^2})$. We also quantify the generalization error on an arbitrary new task. Experiments validate that the learned meta-priors adapt to new tasks with good performance from only a few demonstrations.
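To make the abstract's bi-level structure concrete, a generic sketch of such a problem is given below; the symbols $\theta$, $\phi$, $\mathcal{L}_i$, and $g_i$ are illustrative placeholders, not the paper's own notation. The upper level optimizes a reward meta-prior $\theta$ averaged over $N$ training tasks, while the lower level returns, for each task $i$, a constraint meta-prior $\phi_i^*(\theta)$ that depends on the upper-level variable:

$$\min_{\theta} \; \frac{1}{N}\sum_{i=1}^{N} \mathcal{L}_i\big(\theta, \phi_i^*(\theta)\big) \quad \text{s.t.} \quad \phi_i^*(\theta) \in \arg\min_{\phi} \; g_i(\theta, \phi),$$

where $\mathcal{L}_i$ denotes task $i$'s reward-learning loss and $g_i$ its constraint-learning loss. Adaptation to a new task then amounts to fine-tuning $(\theta, \phi)$ from the learned meta-priors using the few available demonstrations.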

Cite

Text

Liu and Zhu. "Meta Inverse Constrained Reinforcement Learning: Convergence Guarantee and Generalization Analysis." International Conference on Learning Representations, 2024.

Markdown

[Liu and Zhu. "Meta Inverse Constrained Reinforcement Learning: Convergence Guarantee and Generalization Analysis." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/liu2024iclr-meta/)

BibTeX

@inproceedings{liu2024iclr-meta,
  title     = {{Meta Inverse Constrained Reinforcement Learning: Convergence Guarantee and Generalization Analysis}},
  author    = {Liu, Shicheng and Zhu, Minghui},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/liu2024iclr-meta/}
}