AMBER: An Entropy Maximizing Environment Design Algorithm for Inverse Reinforcement Learning
Abstract
In Inverse Reinforcement Learning (IRL), we learn the underlying reward function of humans from observations. Recent work shows that we can learn the reward function more accurately by observing the human in multiple related environments, but efficiently finding informative environments is an open question. We present $\texttt{AMBER}$, an information-theoretic algorithm that generates highly informative environments. With theoretical and empirical analysis, we show that $\texttt{AMBER}$ efficiently finds informative environments and improves reward learning.
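To make the idea of entropy-based environment selection concrete, below is a minimal illustrative sketch in a toy Bayesian reward-learning setting. It is not the AMBER algorithm from the paper; it only shows the generic pattern of scoring candidate environments by how much information a demonstration in them is expected to reveal about the reward. All names and the specific scoring rule (expected information gain over a discrete reward posterior) are assumptions for illustration.

```python
# Illustrative sketch only: generic information-gain-based environment
# selection for reward learning, NOT the paper's AMBER algorithm.
# All quantities (candidate_rewards, candidate_envs, demo_likelihood, ...)
# are hypothetical toy objects.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a small discrete space of reward hypotheses over
# 4 features, with a uniform prior over the hypotheses.
candidate_rewards = rng.normal(size=(5, 4))            # 5 reward hypotheses
prior = np.full(len(candidate_rewards), 1.0 / len(candidate_rewards))

# Hypothetical candidate environments: each maps the 4 reward features to
# 3 possible demonstrated behaviours.
candidate_envs = [rng.normal(size=(3, 4)) for _ in range(6)]


def demo_likelihood(env, reward):
    """Boltzmann-style likelihood of each demonstration under a reward."""
    utilities = env @ reward
    exp_u = np.exp(utilities - utilities.max())
    return exp_u / exp_u.sum()


def expected_information_gain(env, prior):
    """Mutual information between the reward hypothesis and the demonstration."""
    # p(demo | reward) for every hypothesis, shape (n_hypotheses, n_demos)
    likelihoods = np.stack([demo_likelihood(env, r) for r in candidate_rewards])
    marginal = prior @ likelihoods                     # p(demo)
    # I(R; D) = H(D) - H(D | R)
    h_marginal = -np.sum(marginal * np.log(marginal + 1e-12))
    h_conditional = -np.sum(prior[:, None] * likelihoods * np.log(likelihoods + 1e-12))
    return h_marginal - h_conditional


# Pick the candidate environment whose demonstration is expected to be most
# informative about the underlying reward.
scores = [expected_information_gain(env, prior) for env in candidate_envs]
best_env_index = int(np.argmax(scores))
print("Most informative candidate environment:", best_env_index)
```

In this toy version, the environment that maximizes the mutual information between the reward hypothesis and the observed demonstration is selected; the paper's actual objective and environment parameterization may differ.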
Cite
Text
Nitschke et al. "AMBER: An Entropy Maximizing Environment Design Algorithm for Inverse Reinforcement Learning." ICML 2024 Workshops: MFHAIA, 2024.

Markdown

[Nitschke et al. "AMBER: An Entropy Maximizing Environment Design Algorithm for Inverse Reinforcement Learning." ICML 2024 Workshops: MFHAIA, 2024.](https://mlanthology.org/icmlw/2024/nitschke2024icmlw-amber/)

BibTeX
@inproceedings{nitschke2024icmlw-amber,
  title = {{AMBER: An Entropy Maximizing Environment Design Algorithm for Inverse Reinforcement Learning}},
  author = {Nitschke, Paul and Ankile, Lars Lien and Nofshin, Eura and Swaroop, Siddharth and Doshi-Velez, Finale and Pan, Weiwei},
  booktitle = {ICML 2024 Workshops: MFHAIA},
  year = {2024},
  url = {https://mlanthology.org/icmlw/2024/nitschke2024icmlw-amber/}
}