Inverse Reinforcement Learning by Estimating Expertise of Demonstrators

Abstract

In Imitation Learning (IL), utilizing suboptimal and heterogeneous demonstrations presents a substantial challenge due to the varied nature of real-world data. However, standard IL algorithms treat these datasets as homogeneous, thereby inheriting the deficiencies of suboptimal demonstrators. Previous approaches to this issue rely on impractical assumptions such as high-quality data subsets, confidence rankings, or explicit environmental knowledge. This paper introduces IRLEED, *Inverse Reinforcement Learning by Estimating Expertise of Demonstrators*, a novel framework that overcomes these hurdles without prior knowledge of demonstrator expertise. IRLEED enhances existing Inverse Reinforcement Learning (IRL) algorithms by combining two components: a general model of demonstrator suboptimality that addresses reward bias and action variance, and a Maximum Entropy IRL framework that efficiently derives the optimal policy from diverse, suboptimal demonstrations. Experiments in both online and offline IL settings, with simulated and human-generated data, demonstrate IRLEED's adaptability and effectiveness, making it a versatile solution for learning from suboptimal demonstrations.

Cite

Text

Beliaev and Pedarsani. "Inverse Reinforcement Learning by Estimating Expertise of Demonstrators." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I15.33705

Markdown

[Beliaev and Pedarsani. "Inverse Reinforcement Learning by Estimating Expertise of Demonstrators." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/beliaev2025aaai-inverse/) doi:10.1609/AAAI.V39I15.33705

BibTeX

@inproceedings{beliaev2025aaai-inverse,
  title     = {{Inverse Reinforcement Learning by Estimating Expertise of Demonstrators}},
  author    = {Beliaev, Mark and Pedarsani, Ramtin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {15532--15540},
  doi       = {10.1609/AAAI.V39I15.33705},
  url       = {https://mlanthology.org/aaai/2025/beliaev2025aaai-inverse/}
}