Reward Shaping for Model-Based Bayesian Reinforcement Learning
Abstract
Bayesian reinforcement learning (BRL) provides a formal framework for optimally trading off exploration and exploitation in reinforcement learning. Unfortunately, computing the Bayes-optimal behavior is generally intractable except in restricted cases. As a consequence, many BRL algorithms, model-based approaches in particular, rely on approximate models or real-time search methods. In this paper, we present potential-based shaping for improving learning performance in model-based BRL. We propose a number of potential functions that are particularly well suited to BRL and are domain-independent in the sense that they do not require any prior knowledge about the actual environment. By incorporating the potential function into real-time heuristic search, we show that we can significantly improve learning performance on standard benchmark domains.
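For context, potential-based shaping (Ng, Harada & Russell, 1999) augments the immediate reward with a term of the form F(s, a, s') = γΦ(s') − Φ(s), which is known to leave the optimal policy unchanged. The sketch below illustrates only this generic shaping step; the `phi` function is a hypothetical placeholder, and the BRL-specific potential functions proposed in the paper are not reproduced here.

```python
# Minimal sketch of potential-based reward shaping (Ng, Harada & Russell, 1999).
# The potential `phi` is a hypothetical placeholder; the paper proposes
# BRL-specific, domain-independent potentials that are not reproduced here.

GAMMA = 0.95  # discount factor (assumed value for illustration)


def phi(state):
    """Hypothetical potential function, e.g. an optimistic value estimate of `state`."""
    return 0.0  # replace with a domain- or belief-based potential


def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """Augment the environment reward with the shaping term
    F(s, a, s') = gamma * phi(s') - phi(s), which preserves the optimal policy."""
    return reward + gamma * phi(next_state) - phi(state)
```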
Cite
Text
Kim et al. "Reward Shaping for Model-Based Bayesian Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9702

Markdown
[Kim et al. "Reward Shaping for Model-Based Bayesian Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/kim2015aaai-reward/) doi:10.1609/AAAI.V29I1.9702

BibTeX
@inproceedings{kim2015aaai-reward,
title = {{Reward Shaping for Model-Based Bayesian Reinforcement Learning}},
author = {Kim, Hyeoneun and Lim, Woosang and Lee, Kanghoon and Noh, Yung-Kyun and Kim, Kee-Eung},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2015},
pages = {3548-3555},
doi = {10.1609/AAAI.V29I1.9702},
url = {https://mlanthology.org/aaai/2015/kim2015aaai-reward/}
}