Tighter Value Function Bounds for Bayesian Reinforcement Learning
Abstract
Bayesian reinforcement learning (BRL) provides a principled framework for the optimal exploration-exploitation tradeoff in reinforcement learning. We focus on model-based BRL, which yields a compact formulation of the optimal tradeoff from the Bayesian perspective. However, computing the Bayes-optimal policy remains a computational challenge. In this paper, we propose a novel approach to computing tighter bounds on the Bayes-optimal value function, which is crucial for improving the performance of many model-based BRL algorithms. We then show how our bounds can be integrated into real-time AO* heuristic search, and provide a theoretical analysis of the impact of the improved bounds on search efficiency. We also provide empirical results on standard BRL domains that demonstrate the effectiveness of our approach.
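The following is a minimal sketch, not taken from the paper, of the general reason tighter value bounds improve heuristic search: an action whose upper bound falls below the best lower bound among the alternatives can never be optimal and can be pruned from the search tree. The function and the example bound values below are purely illustrative.

```python
# Illustrative only: generic bound-based pruning, not the paper's algorithm.

def prunable_actions(bounds):
    """bounds: dict mapping action -> (lower, upper) bound on its Q-value.
    Returns the set of actions that provably cannot be optimal."""
    best_lower = max(lo for lo, _ in bounds.values())
    return {a for a, (_, up) in bounds.items() if up < best_lower}

# Loose bounds prune nothing ...
loose = {"a1": (0.0, 10.0), "a2": (1.0, 9.0), "a3": (0.5, 8.0)}
# ... while tighter bounds on the same actions rule out two of the three.
tight = {"a1": (4.0, 5.0), "a2": (6.0, 7.0), "a3": (2.0, 3.0)}

print(prunable_actions(loose))  # set()
print(prunable_actions(tight))  # {'a1', 'a3'}
```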
Cite
Text
Lee and Kim. "Tighter Value Function Bounds for Bayesian Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9700
Markdown
[Lee and Kim. "Tighter Value Function Bounds for Bayesian Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/lee2015aaai-tighter/) doi:10.1609/AAAI.V29I1.9700
BibTeX
@inproceedings{lee2015aaai-tighter,
title = {{Tighter Value Function Bounds for Bayesian Reinforcement Learning}},
author = {Lee, Kanghoon and Kim, Kee-Eung},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2015},
pages = {3556--3563},
doi = {10.1609/AAAI.V29I1.9700},
url = {https://mlanthology.org/aaai/2015/lee2015aaai-tighter/}
}