Adaptive Choice of Grid and Time in Reinforcement Learning

Pareigis, Stephan

NeurIPS 1997 pp. 1036-1042

Abstract

We propose local error estimates together with algorithms for adaptive a-posteriori grid and time refinement in reinforcement learning. We consider a deterministic system with continuous state and time with infinite horizon discounted cost functional. For grid refinement we follow the procedure of numerical methods for the Bellman-equation. For time refinement we propose a new criterion, based on consistency estimates of discrete solutions of the Bellman-equation. We demonstrate that an optimal ratio of time to space discretization is crucial for optimal learning rates and accuracy of the approximate optimal value function.
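To make the setting concrete, here is a minimal sketch of a discretized Bellman equation with a fixed grid and a fixed time step. It is not the paper's adaptive algorithm and includes no error estimates or refinement. The one-dimensional dynamics `x' = u`, the running cost `x**2`, the control set, the discount rate `beta`, and the function name `value_iteration` are all assumptions chosen for illustration.

```python
import numpy as np

def value_iteration(n_grid, tau, beta=1.0, n_iter=5000, tol=1e-10):
    """Fixed-grid, fixed-time-step value iteration (illustrative only).

    Assumed toy problem: state x in [-1, 1], dynamics x' = u with
    u in {-1, 0, 1}, running cost x**2, discount rate beta.
    """
    xs = np.linspace(-1.0, 1.0, n_grid)    # space discretization
    controls = np.array([-1.0, 0.0, 1.0])  # assumed finite control set
    discount = np.exp(-beta * tau)         # discount over one time step
    V = np.zeros(n_grid)
    for _ in range(n_iter):
        candidates = []
        for u in controls:
            # Explicit Euler step of length tau, kept inside the domain.
            x_next = np.clip(xs + tau * u, -1.0, 1.0)
            # Cost over one step plus the discounted, linearly
            # interpolated value at the successor state.
            candidates.append(tau * xs**2 + discount * np.interp(x_next, xs, V))
        V_new = np.min(candidates, axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return xs, V

xs, V = value_iteration(n_grid=41, tau=0.05)
```

Calling `value_iteration` with the same `n_grid` but different values of `tau` gives a rough feel for how the ratio of time step to grid spacing affects the computed value function in this toy problem.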

Cite

Text

Pareigis. "Adaptive Choice of Grid and Time in Reinforcement Learning." Neural Information Processing Systems, 1997.

Markdown

[Pareigis. "Adaptive Choice of Grid and Time in Reinforcement Learning." Neural Information Processing Systems, 1997.](https://mlanthology.org/neurips/1997/pareigis1997neurips-adaptive/)

BibTeX

@inproceedings{pareigis1997neurips-adaptive,
  title     = {{Adaptive Choice of Grid and Time in Reinforcement Learning}},
  author    = {Pareigis, Stephan},
  booktitle = {Neural Information Processing Systems},
  year      = {1997},
  pages     = {1036-1042},
  url       = {https://mlanthology.org/neurips/1997/pareigis1997neurips-adaptive/}
}