Reward Shaping for Model-Based Bayesian Reinforcement Learning

Abstract

Bayesian reinforcement learning (BRL) provides a formal framework for optimally trading off exploration and exploitation in reinforcement learning. Unfortunately, computing the Bayes-optimal behavior is generally intractable except in restricted cases. As a consequence, many BRL algorithms, model-based approaches in particular, rely on approximate models or real-time search methods. In this paper, we present potential-based shaping for improving learning performance in model-based BRL. We propose a number of potential functions that are particularly well suited to BRL and are domain-independent in the sense that they do not require any prior knowledge about the actual environment. By incorporating these potential functions into real-time heuristic search, we show that learning performance can be significantly improved on standard benchmark domains.
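
For readers unfamiliar with the technique, potential-based shaping in the sense of Ng, Harada, and Russell (1999) augments the reward with the discounted difference of a potential function Φ over states. The sketch below gives only this standard, generic form; the paper's specific contribution, potential functions tailored to the Bayes-adaptive state space, is not captured by it.

\[
  R'(s, a, s') \;=\; R(s, a, s') \;+\; \underbrace{\gamma\,\Phi(s') - \Phi(s)}_{\text{shaping term } F(s,a,s')}
\]

Because the shaping terms telescope along any trajectory, the shaped problem preserves the optimal policies of the original one; the potential acts only as heuristic guidance that can speed up learning and search.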

Cite

Text

Kim et al. "Reward Shaping for Model-Based Bayesian Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9702

Markdown

[Kim et al. "Reward Shaping for Model-Based Bayesian Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/kim2015aaai-reward/) doi:10.1609/AAAI.V29I1.9702

BibTeX

@inproceedings{kim2015aaai-reward,
  title     = {{Reward Shaping for Model-Based Bayesian Reinforcement Learning}},
  author    = {Kim, Hyeoneun and Lim, Woosang and Lee, Kanghoon and Noh, Yung-Kyun and Kim, Kee-Eung},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2015},
  pages     = {3548--3555},
  doi       = {10.1609/AAAI.V29I1.9702},
  url       = {https://mlanthology.org/aaai/2015/kim2015aaai-reward/}
}