Adaptive Discretization for Model-Based Reinforcement Learning
Abstract
We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm that are competitive with state-of-the-art model-based algorithms. Moreover, our bounds are obtained via a modular proof technique, which can potentially extend to incorporate additional structure on the problem.
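To give a flavor of the idea, here is a minimal, illustrative sketch of adaptive discretization with optimistic selection on a one-dimensional state-action space. It is not the paper's algorithm: the splitting threshold, bonus constants, and class names (`Cell`, `AdaptivePartition`) are assumptions for illustration, and the sketch omits the transition-model estimates and one-step value iteration that the full model-based method maintains.

```python
import numpy as np

# Hypothetical sketch: adaptively refined partition of a [0,1] x [0,1]
# state-action space with optimistic cell selection. Constants and the
# splitting rule are illustrative, not the paper's exact choices.

class Cell:
    """An axis-aligned region of the state-action space with its own statistics."""
    def __init__(self, s_lo, s_hi, a_lo, a_hi):
        self.s_lo, self.s_hi, self.a_lo, self.a_hi = s_lo, s_hi, a_lo, a_hi
        self.count = 0          # number of visits to this cell
        self.reward_sum = 0.0   # running sum of observed rewards
        self.children = []      # populated once the cell is split

    def diameter(self):
        return max(self.s_hi - self.s_lo, self.a_hi - self.a_lo)

    def optimistic_value(self):
        # Empirical mean plus an exploration bonus that shrinks with visits
        # and with the cell's diameter (a Lipschitz-style bias term).
        mean = self.reward_sum / max(self.count, 1)
        bonus = 2.0 / np.sqrt(max(self.count, 1)) + self.diameter()
        return mean + bonus

class AdaptivePartition:
    """Tree of cells, refined only where enough data has been gathered."""
    def __init__(self, split_threshold=8):
        self.root = Cell(0.0, 1.0, 0.0, 1.0)
        self.split_threshold = split_threshold

    def leaves(self, s):
        """All leaf cells whose state slice contains s."""
        out, stack = [], [self.root]
        while stack:
            c = stack.pop()
            if c.children:
                stack.extend(c.children)
            elif c.s_lo <= s <= c.s_hi:
                out.append(c)
        return out

    def select_action(self, s):
        """Pick the action at the center of the most optimistic relevant cell."""
        best = max(self.leaves(s), key=Cell.optimistic_value)
        return best, 0.5 * (best.a_lo + best.a_hi)

    def update(self, cell, reward):
        cell.count += 1
        cell.reward_sum += reward
        # Refine the cell once its visit count exceeds a threshold tied to its
        # size, so resolution concentrates where the algorithm actually visits.
        if not cell.children and cell.count >= self.split_threshold / max(cell.diameter(), 1e-8):
            sm, am = 0.5 * (cell.s_lo + cell.s_hi), 0.5 * (cell.a_lo + cell.a_hi)
            cell.children = [
                Cell(slo, shi, alo, ahi)
                for slo, shi in [(cell.s_lo, sm), (sm, cell.s_hi)]
                for alo, ahi in [(cell.a_lo, am), (am, cell.a_hi)]
            ]

# Toy usage: a one-step loop with reward peaked near action 0.7.
rng = np.random.default_rng(0)
partition = AdaptivePartition()
for _ in range(500):
    s = rng.uniform()
    cell, a = partition.select_action(s)
    r = np.exp(-20 * (a - 0.7) ** 2) + 0.1 * rng.normal()
    partition.update(cell, r)
print("leaf cells covering s=0.5:", len(partition.leaves(0.5)))
```

The design intent this sketch tries to convey is that the partition is data-driven: cells are split only after sufficient visits, so the discretization becomes fine near frequently visited, high-value regions while staying coarse elsewhere.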
Cite
Text
Sinclair et al. "Adaptive Discretization for Model-Based Reinforcement Learning." Neural Information Processing Systems, 2020.

Markdown

[Sinclair et al. "Adaptive Discretization for Model-Based Reinforcement Learning." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/sinclair2020neurips-adaptive/)

BibTeX
@inproceedings{sinclair2020neurips-adaptive,
title = {{Adaptive Discretization for Model-Based Reinforcement Learning}},
author = {Sinclair, Sean and Wang, Tianyu and Jain, Gauri and Banerjee, Siddhartha and Yu, Christina},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/sinclair2020neurips-adaptive/}
}