Constrained Bayesian Reinforcement Learning via Approximate Linear Programming

Abstract

In this paper, we consider the safe learning scenario where we need to restrict the exploratory behavior of a reinforcement learning agent. Specifically, we treat the problem as a form of Bayesian reinforcement learning in an environment that is modeled as a constrained MDP (CMDP) where the cost function penalizes undesirable situations. We propose a model-based Bayesian reinforcement learning (BRL) algorithm for such an environment, eliciting risk-sensitive exploration in a principled way. Our algorithm efficiently solves the constrained BRL problem by approximate linear programming, and generates a finite state controller in an off-line manner. We provide theoretical guarantees and demonstrate empirically that our approach outperforms the state of the art.
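To make the constrained setup concrete, below is a minimal sketch (not the authors' algorithm) of the standard occupancy-measure linear program for a discounted constrained MDP, the kind of LP subproblem that approximate linear programming approaches like this one build on. The tiny 2-state, 2-action MDP, its rewards, costs, and the budget d are all hypothetical illustration values.

import numpy as np
from scipy.optimize import linprog

n_s, n_a, gamma = 2, 2, 0.95
P = np.zeros((n_s, n_a, n_s))            # P[s, a, s'] transition probabilities
P[0, 0] = [0.9, 0.1]; P[0, 1] = [0.2, 0.8]
P[1, 0] = [0.7, 0.3]; P[1, 1] = [0.1, 0.9]
R = np.array([[1.0, 2.0], [0.5, 3.0]])   # reward r(s, a)
C = np.array([[0.0, 1.0], [0.2, 2.0]])   # cost c(s, a) penalizing undesirable situations
d = 10.0                                  # budget on expected discounted cost
alpha = np.array([1.0, 0.0])              # initial state distribution

# Decision variables: occupancy measures x(s, a), flattened to length n_s * n_a.
# Flow constraints: sum_a x(s', a) - gamma * sum_{s,a} P(s'|s,a) x(s, a) = alpha(s').
A_eq = np.zeros((n_s, n_s * n_a))
for sp in range(n_s):
    for s in range(n_s):
        for a in range(n_a):
            A_eq[sp, s * n_a + a] = (s == sp) - gamma * P[s, a, sp]
b_eq = alpha

# Cost constraint: sum_{s,a} c(s, a) x(s, a) <= d.
A_ub = C.reshape(1, -1)
b_ub = np.array([d])

# Maximize expected discounted reward, i.e. minimize its negation.
res = linprog(-R.ravel(), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, None), method="highs")
x = res.x.reshape(n_s, n_a)
policy = x / x.sum(axis=1, keepdims=True)  # randomized policy pi(a|s)
print("optimal constrained value:", -res.fun)
print("policy:\n", policy)

The resulting policy is in general randomized, which is characteristic of constrained MDPs; the paper's contribution is doing this kind of constrained planning over the Bayesian (belief-augmented) model and approximating the LP so it stays tractable.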

Cite

Text

Lee et al. "Constrained Bayesian Reinforcement Learning via Approximate Linear Programming." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/290

Markdown

[Lee et al. "Constrained Bayesian Reinforcement Learning via Approximate Linear Programming." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/lee2017ijcai-constrained/) doi:10.24963/IJCAI.2017/290

BibTeX

@inproceedings{lee2017ijcai-constrained,
  title     = {{Constrained Bayesian Reinforcement Learning via Approximate Linear Programming}},
  author    = {Lee, Jongmin and Jang, Youngsoo and Poupart, Pascal and Kim, Kee-Eung},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {2088--2095},
  doi       = {10.24963/IJCAI.2017/290},
  url       = {https://mlanthology.org/ijcai/2017/lee2017ijcai-constrained/}
}