Constrained Bayesian Reinforcement Learning via Approximate Linear Programming
Abstract
In this paper, we consider the safe learning scenario where we need to restrict the exploratory behavior of a reinforcement learning agent. Specifically, we treat the problem as a form of Bayesian reinforcement learning in an environment that is modeled as a constrained MDP (CMDP) where the cost function penalizes undesirable situations. We propose a model-based Bayesian reinforcement learning (BRL) algorithm for such an environment, eliciting risk-sensitive exploration in a principled way. Our algorithm efficiently solves the constrained BRL problem by approximate linear programming, and generates a finite state controller in an off-line manner. We provide theoretical guarantees and demonstrate empirically that our approach outperforms the state of the art.
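For readers unfamiliar with the linear-programming view of constrained MDPs that underlies this work, the following is a minimal sketch of the standard occupation-measure LP for a discounted CMDP (in the spirit of Altman's formulation); the symbols used here (occupancy measure y, reward r, cost c, cost budget \hat{c}, initial distribution \mu, discount \gamma) are generic notation for illustration and are not taken from the paper.

\begin{align}
\max_{y \ge 0} \quad & \sum_{s,a} y(s,a)\, r(s,a) \\
\text{s.t.} \quad & \sum_{a} y(s',a) \;=\; \mu(s') + \gamma \sum_{s,a} P(s' \mid s,a)\, y(s,a) \qquad \forall s' \\
& \sum_{s,a} y(s,a)\, c(s,a) \;\le\; \hat{c}
\end{align}

A (possibly stochastic) optimal policy can then be read off as \pi(a \mid s) \propto y(s,a). Approximate linear programming, as the title suggests, replaces this exact (and in the Bayes-adaptive setting, intractably large) program with a restricted, parameterized one; the precise approximation used in the paper is described in the full text.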
Cite
Text
Lee et al. "Constrained Bayesian Reinforcement Learning via Approximate Linear Programming." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/290
Markdown
[Lee et al. "Constrained Bayesian Reinforcement Learning via Approximate Linear Programming." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/lee2017ijcai-constrained/) doi:10.24963/IJCAI.2017/290
BibTeX
@inproceedings{lee2017ijcai-constrained,
title = {{Constrained Bayesian Reinforcement Learning via Approximate Linear Programming}},
author = {Lee, Jongmin and Jang, Youngsoo and Poupart, Pascal and Kim, Kee-Eung},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2017},
pages = {2088--2095},
doi = {10.24963/IJCAI.2017/290},
url = {https://mlanthology.org/ijcai/2017/lee2017ijcai-constrained/}
}