Inverse Reinforcement Learning by Estimating Expertise of Demonstrators

Beliaev, Mark; Pedarsani, Ramtin

doi:10.1609/AAAI.V39I15.33705

AAAI 2025 pp. 15532-15540

Abstract

In Imitation Learning (IL), utilizing suboptimal and heterogeneous demonstrations presents a substantial challenge due to the varied nature of real-world data. However, standard IL algorithms consider these datasets as homogeneous, thereby inheriting the deficiencies of suboptimal demonstrators. Previous approaches to this issue rely on impractical assumptions like high-quality data subsets, confidence rankings, or explicit environmental knowledge. This paper introduces IRLEED, *Inverse Reinforcement Learning by Estimating Expertise of Demonstrators*, a novel framework that overcomes these hurdles without prior knowledge of demonstrator expertise. IRLEED enhances existing Inverse Reinforcement Learning (IRL) algorithms by combining a general model for demonstrator suboptimality to address reward bias and action variance, with a Maximum Entropy IRL framework to efficiently derive the optimal policy from diverse, suboptimal demonstrations. Experiments in both online and offline IL settings, with simulated and human-generated data, demonstrate IRLEED's adaptability and effectiveness, making it a versatile solution for learning from suboptimal demonstrations.

Cite

Text

Beliaev and Pedarsani. "Inverse Reinforcement Learning by Estimating Expertise of Demonstrators." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I15.33705

Markdown

[Beliaev and Pedarsani. "Inverse Reinforcement Learning by Estimating Expertise of Demonstrators." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/beliaev2025aaai-inverse/) doi:10.1609/AAAI.V39I15.33705

BibTeX

@inproceedings{beliaev2025aaai-inverse,
  title     = {{Inverse Reinforcement Learning by Estimating Expertise of Demonstrators}},
  author    = {Beliaev, Mark and Pedarsani, Ramtin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {15532--15540},
  doi       = {10.1609/AAAI.V39I15.33705},
  url       = {https://mlanthology.org/aaai/2025/beliaev2025aaai-inverse/}
}