Bayes-Adaptive Simulation-Based Search with Value Function Approximation

Abstract

Bayes-adaptive planning offers a principled solution to the exploration-exploitation trade-off under model uncertainty. It finds the optimal policy in belief space, which explicitly accounts for the expected effect on future rewards of reductions in uncertainty. However, the Bayes-adaptive solution is typically intractable in domains with large or continuous state spaces. We present a tractable method for approximating the Bayes-adaptive solution by combining simulation-based search with a novel value function approximation technique that generalises over belief space. Our method outperforms prior approaches in both discrete bandit tasks and simple continuous navigation and control tasks.
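As a rough illustration of the planning problem the abstract describes, the sketch below (plain Python, with hypothetical helper names) runs Bayes-adaptive simulation-based search on a two-armed Bernoulli bandit: the agent plans over belief states (Beta posteriors per arm) by simulating rollouts under models sampled from the posterior and averaging returns. Note this omits the paper's contribution, value function approximation over belief space; plain Monte-Carlo averaging stands in for it.

```python
import random

def rollout_value(belief, first_arm, horizon, rng):
    """Simulate one episode in belief space: sample a model from the
    current posterior (root sampling), then act greedily on the
    evolving posterior means after the forced first pull."""
    # Sample a ground-truth success probability for each arm
    probs = [rng.betavariate(a, b) for a, b in belief]
    belief = [list(ab) for ab in belief]  # copy so updates stay local
    total = 0.0
    arm = first_arm
    for _ in range(horizon):
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        total += reward
        # Bayesian update of the Beta posterior for the pulled arm
        if reward:
            belief[arm][0] += 1
        else:
            belief[arm][1] += 1
        # Greedy rollout policy on posterior means for later pulls
        arm = max(range(len(belief)),
                  key=lambda i: belief[i][0] / (belief[i][0] + belief[i][1]))
    return total

def ba_search(belief, horizon=10, n_sims=200, seed=0):
    """Estimate the belief-space value of each candidate first action
    by Monte-Carlo averaging, and return the best first arm."""
    rng = random.Random(seed)
    values = []
    for arm in range(len(belief)):
        v = sum(rollout_value(belief, arm, horizon, rng)
                for _ in range(n_sims)) / n_sims
        values.append(v)
    return max(range(len(values)), key=values.__getitem__)

# With strong evidence that arm 0 is good (Beta(9, 1)) and arm 1 is
# bad (Beta(1, 9)), the search should prefer arm 0.
best = ba_search([(9, 1), (1, 9)], horizon=10, n_sims=200, seed=1)
```

Because each rollout updates the posterior as it goes, the estimated values account for the information gained by pulling an arm, which is exactly the exploration-exploitation trade-off the Bayes-adaptive formulation captures; the paper's method additionally generalises these value estimates across similar beliefs rather than re-estimating each one from scratch.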

Cite

Text

Guez et al. "Bayes-Adaptive Simulation-Based Search with Value Function Approximation." Neural Information Processing Systems, 2014.

Markdown

[Guez et al. "Bayes-Adaptive Simulation-Based Search with Value Function Approximation." Neural Information Processing Systems, 2014.](https://mlanthology.org/neurips/2014/guez2014neurips-bayesadaptive/)

BibTeX

@inproceedings{guez2014neurips-bayesadaptive,
  title     = {{Bayes-Adaptive Simulation-Based Search with Value Function Approximation}},
  author    = {Guez, Arthur and Heess, Nicolas and Silver, David and Dayan, Peter},
  booktitle = {Neural Information Processing Systems},
  year      = {2014},
  pages     = {451--459},
  url       = {https://mlanthology.org/neurips/2014/guez2014neurips-bayesadaptive/}
}