Repeated Inverse Reinforcement Learning
Abstract
We introduce a novel repeated Inverse Reinforcement Learning problem: the agent has to act on behalf of a human in a sequence of tasks and wishes to minimize the number of tasks in which it surprises the human by acting suboptimally with respect to how the human would have acted. Each time the human is surprised, the agent is provided a demonstration of the desired behavior by the human. We formalize this problem, including how the sequence of tasks is chosen, in a few different ways and provide some foundational results.
Cite
Text
Amin et al. "Repeated Inverse Reinforcement Learning." Neural Information Processing Systems, 2017.
Markdown
[Amin et al. "Repeated Inverse Reinforcement Learning." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/amin2017neurips-repeated/)
BibTeX
@inproceedings{amin2017neurips-repeated,
  title = {{Repeated Inverse Reinforcement Learning}},
  author = {Amin, Kareem and Jiang, Nan and Singh, Satinder},
  booktitle = {Neural Information Processing Systems},
  year = {2017},
  pages = {1815--1824},
  url = {https://mlanthology.org/neurips/2017/amin2017neurips-repeated/}
}