Multidimensional Triangulation and Interpolation for Reinforcement Learning
Abstract
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an array of boxes. This is often problematic above two dimensions: a coarse quantization can lead to poor policies, and fine quantization is too expensive. Possible solutions are variable-resolution discretization, or function approximation by neural nets. A third option, which has been little studied in the reinforcement learning literature, is interpolation on a coarse grid. In this paper we study interpolation techniques that can result in vast improvements in the online behavior of the resulting control systems: multilinear interpolation, and an interpolation algorithm based on an interesting regular triangulation of d-dimensional space. We adapt these interpolators under three reinforcement learning paradigms: (i) offline value iteration with a known model, (ii) Q-learning, and (iii) online value iteration with a previously unknown model learned from data. We describe empirical results, and the resulting implications for practical learning of continuous non-linear dynamic control.
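The abstract does not include pseudocode, but the triangulation-based interpolator it refers to is of the kind commonly built on the Kuhn (Freudenthal) triangulation of each grid cell, where a query point is expressed with barycentric weights over d+1 simplex vertices rather than the 2^d corners needed by multilinear interpolation. Below is a minimal sketch of that idea, assuming a value table V stored as a d-dimensional array indexed in grid coordinates; the function name simplex_interpolate and all other details are illustrative, not the paper's implementation.

```python
import numpy as np

def simplex_interpolate(V, x):
    """Interpolate a value table V (d-dimensional numpy array of values at
    integer grid vertices) at a continuous point x given in grid coordinates,
    using the Kuhn/Freudenthal triangulation of each grid cell.

    Only d+1 vertices are touched per query, versus 2**d for multilinear
    interpolation."""
    x = np.asarray(x, dtype=float)
    d = x.size

    # Clip the query point inside the grid, then split it into the cell's
    # base (lower) corner and the fractional offset within that cell.
    hi = np.array(V.shape) - 1
    x = np.clip(x, 0.0, hi - 1e-12)
    base = np.floor(x).astype(int)
    frac = x - base

    # Sorting the coordinates by decreasing fractional part selects the
    # simplex of the triangulation that contains the point.
    order = np.argsort(-frac)
    f_sorted = frac[order]

    # Barycentric weights for the d+1 simplex vertices (they sum to 1).
    weights = np.empty(d + 1)
    weights[0] = 1.0 - f_sorted[0]
    weights[1:d] = f_sorted[:d - 1] - f_sorted[1:]
    weights[d] = f_sorted[d - 1]

    # Walk from the base corner, taking one unit step per sorted axis,
    # and accumulate the weighted vertex values.
    value = weights[0] * V[tuple(base)]
    vertex = base.copy()
    for k in range(d):
        vertex[order[k]] += 1
        value += weights[k + 1] * V[tuple(vertex)]
    return value
```

A call such as simplex_interpolate(V, [2.3, 0.7, 4.1]) on a 3-dimensional table blends exactly four stored values, which is what makes this interpolator attractive as the dimension grows; a multilinear interpolator over the same grid would need all eight corners of the cell.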
Cite
Text
Davies. "Multidimensional Triangulation and Interpolation for Reinforcement Learning." Neural Information Processing Systems, 1996.Markdown
[Davies. "Multidimensional Triangulation and Interpolation for Reinforcement Learning." Neural Information Processing Systems, 1996.](https://mlanthology.org/neurips/1996/davies1996neurips-multidimensional/)BibTeX
@inproceedings{davies1996neurips-multidimensional,
title = {{Multidimensional Triangulation and Interpolation for Reinforcement Learning}},
author = {Davies, Scott},
booktitle = {Neural Information Processing Systems},
year = {1996},
pages = {1005-1011},
url = {https://mlanthology.org/neurips/1996/davies1996neurips-multidimensional/}
}