Adaptive Discretization for Model-Based Reinforcement Learning

Abstract

We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm for large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration, extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm that are competitive with state-of-the-art model-based algorithms. Moreover, our bounds are obtained via a modular proof technique, which can potentially be extended to incorporate additional structure in the problem.
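To make the idea concrete, the following is a minimal, hypothetical Python sketch in the spirit of the abstract: a tree-structured partition of a one-dimensional state and action space that is refined where data concentrates, together with optimistic value estimates whose bonus shrinks as regions are split. The class names, split rule, and bonus constants are illustrative assumptions, not the paper's exact algorithm.

import math


class Region:
    """An axis-aligned cell of the state-action space [0,1] x [0,1]."""

    def __init__(self, s_lo, s_hi, a_lo, a_hi):
        self.bounds = (s_lo, s_hi, a_lo, a_hi)
        self.n = 0             # visit count for this cell
        self.reward_sum = 0.0  # accumulated observed rewards
        self.children = []     # populated once the cell is split

    def diameter(self):
        s_lo, s_hi, a_lo, a_hi = self.bounds
        return max(s_hi - s_lo, a_hi - a_lo)

    def contains(self, s, a):
        s_lo, s_hi, a_lo, a_hi = self.bounds
        return s_lo <= s <= s_hi and a_lo <= a <= a_hi

    def split(self):
        """Refine the cell into four equal quadrants (children start with fresh statistics in this simplification)."""
        s_lo, s_hi, a_lo, a_hi = self.bounds
        s_mid, a_mid = (s_lo + s_hi) / 2, (a_lo + a_hi) / 2
        self.children = [
            Region(s_lo, s_mid, a_lo, a_mid), Region(s_lo, s_mid, a_mid, a_hi),
            Region(s_mid, s_hi, a_lo, a_mid), Region(s_mid, s_hi, a_mid, a_hi),
        ]


class AdaptivePartition:
    """Tree of regions over [0,1] x [0,1], refined where the data concentrates."""

    def __init__(self, split_threshold=4.0):
        self.root = Region(0.0, 1.0, 0.0, 1.0)
        self.split_threshold = split_threshold  # illustrative constant

    def leaf(self, s, a):
        node = self.root
        while node.children:
            node = next(c for c in node.children if c.contains(s, a))
        return node

    def update(self, s, a, reward):
        node = self.leaf(s, a)
        node.n += 1
        node.reward_sum += reward
        # Split once a cell has enough samples relative to its size,
        # so frequently visited areas receive a finer discretization.
        if node.n >= self.split_threshold / node.diameter() ** 2:
            node.split()

    def optimistic_q(self, s, a):
        node = self.leaf(s, a)
        mean = node.reward_sum / node.n if node.n else 0.0
        # Optimism: a sampling-noise bonus plus a discretization-error term
        # proportional to the cell diameter, which shrinks under refinement.
        bonus = (1.0 / math.sqrt(node.n) if node.n else 1.0) + node.diameter()
        return mean + bonus

A typical loop would call partition.update(s, a, r) after each observed transition and partition.optimistic_q(s, a) when selecting actions, so that optimism drives exploration and refinement toward the frequently visited, high-value parts of the space.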

Cite

Text

Sinclair et al. "Adaptive Discretization for Model-Based Reinforcement Learning." Neural Information Processing Systems, 2020.

Markdown

[Sinclair et al. "Adaptive Discretization for Model-Based Reinforcement Learning." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/sinclair2020neurips-adaptive/)

BibTeX

@inproceedings{sinclair2020neurips-adaptive,
  title     = {{Adaptive Discretization for Model-Based Reinforcement Learning}},
  author    = {Sinclair, Sean and Wang, Tianyu and Jain, Gauri and Banerjee, Siddhartha and Yu, Christina},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/sinclair2020neurips-adaptive/}
}