Bootstrap Estimated Uncertainty of the Environment Model for Model-Based Reinforcement Learning

Abstract

Model-based reinforcement learning (RL) methods attempt to learn a dynamics model that simulates the real environment and use the model to make better decisions. However, the learned environment simulator usually contains some degree of model error, which can disturb decision making and degrade performance. We propose a bootstrapped model-based RL method that bootstraps the modules at each depth of the planning tree. This method quantifies the uncertainty of the environment model on different state-action pairs and leads the agent to explore the pairs with higher uncertainty, reducing potential model errors. Moreover, we sample target values from their bootstrap distribution to connect the uncertainties at the current and subsequent time-steps, and we introduce a prior mechanism to improve exploration efficiency. Experimental results demonstrate that our method efficiently decreases model error and outperforms TreeQN and other state-of-the-art methods on multiple Atari games.
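
For illustration only, below is a minimal sketch (not the authors' implementation) of the bootstrap-uncertainty idea the abstract describes: several dynamics models are fit on bootstrap resamples of the transition data, and their disagreement on a state-action pair serves as an uncertainty signal that can drive exploration. The class and function names (`BootstrapDynamicsEnsemble`, `uncertainty`) and the use of simple linear models are assumptions made for the sketch.

```python
# Illustrative sketch of bootstrap-ensemble uncertainty for a dynamics model.
# Not the authors' code; linear heads stand in for whatever model class is used.
import numpy as np


class BootstrapDynamicsEnsemble:
    """K linear next-state predictors, each fit on a bootstrap resample."""

    def __init__(self, n_models=5, seed=0):
        self.n_models = n_models
        self.rng = np.random.default_rng(seed)
        self.weights = []  # one weight matrix per bootstrap head

    def fit(self, states, actions, next_states):
        X = np.hstack([states, actions])          # (N, ds + da)
        Y = next_states                           # (N, ds)
        N = X.shape[0]
        self.weights = []
        for _ in range(self.n_models):
            idx = self.rng.integers(0, N, size=N)  # bootstrap resample with replacement
            W, *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
            self.weights.append(W)

    def predict_all(self, state, action):
        x = np.concatenate([state, action])
        return np.stack([x @ W for W in self.weights])  # (K, ds) predicted next states

    def uncertainty(self, state, action):
        # Disagreement across bootstrap heads as a proxy for model error
        # on this particular (state, action) pair.
        preds = self.predict_all(state, action)
        return preds.std(axis=0).mean()


# Usage: prefer actions whose predicted outcome the ensemble disagrees on most.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    S = rng.normal(size=(200, 4))
    A = rng.normal(size=(200, 2))
    S_next = S + 0.1 * A @ rng.normal(size=(2, 4))

    ens = BootstrapDynamicsEnsemble(n_models=5)
    ens.fit(S, A, S_next)

    s = rng.normal(size=4)
    candidates = [rng.normal(size=2) for _ in range(3)]
    bonuses = [ens.uncertainty(s, a) for a in candidates]
    print("exploration bonuses:", np.round(bonuses, 4))
```

In the paper's setting the same principle is applied to the modules at each depth of the planning tree, rather than to a single one-step model as in this sketch.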

Cite

Text

Huang et al. "Bootstrap Estimated Uncertainty of the Environment Model for Model-Based Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33013870

Markdown

[Huang et al. "Bootstrap Estimated Uncertainty of the Environment Model for Model-Based Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/huang2019aaai-bootstrap/) doi:10.1609/AAAI.V33I01.33013870

BibTeX

@inproceedings{huang2019aaai-bootstrap,
  title     = {{Bootstrap Estimated Uncertainty of the Environment Model for Model-Based Reinforcement Learning}},
  author    = {Huang, Wenzhen and Zhang, Junge and Huang, Kaiqi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2019},
  pages     = {3870-3877},
  doi       = {10.1609/AAAI.V33I01.33013870},
  url       = {https://mlanthology.org/aaai/2019/huang2019aaai-bootstrap/}
}