Barycentric Interpolators for Continuous Space and Time Reinforcement Learning

Abstract

In order to find the optimal control of continuous state-space and time reinforcement learning (RL) problems, we approximate the value function (VF) with a particular class of functions called the barycentric interpolators. We establish sufficient conditions under which an RL algorithm converges to the optimal VF, even when we use approximate models of the state dynamics and the reinforcement functions.
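As a rough illustration of the core idea (not the paper's algorithm), a barycentric interpolator represents the value at a state as a convex combination of values stored at the vertices of a simplex containing that state, weighted by the state's barycentric coordinates. A minimal NumPy sketch for a 2-D triangle (all names here are illustrative, not from the paper):

```python
import numpy as np

def barycentric_coords(x, vertices):
    """Barycentric coordinates of point x w.r.t. a d-simplex.

    vertices: (d+1, d) array of simplex vertices in R^d.
    Returns lambdas with sum(lambdas) == 1 and lambdas @ vertices == x;
    all lambdas are nonnegative iff x lies inside the simplex.
    """
    # Solve the linear system [vertices^T; 1...1] @ lam = [x; 1].
    A = np.vstack([vertices.T, np.ones(len(vertices))])
    b = np.append(x, 1.0)
    return np.linalg.solve(A, b)

def interpolate_value(x, vertices, vertex_values):
    """Barycentric interpolation of vertex values at point x."""
    lam = barycentric_coords(x, vertices)
    return lam @ vertex_values

# Triangle with value-function samples stored at its three vertices.
tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
vals = np.array([0.0, 1.0, 2.0])
print(interpolate_value(np.array([0.25, 0.25]), tri, vals))  # 0.75
```

Because the weights are nonnegative and sum to one inside the simplex, the interpolated value is a convex combination of the stored values, a property the paper exploits to obtain convergence guarantees.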

Cite

Text

Munos and Moore. "Barycentric Interpolators for Continuous Space and Time Reinforcement Learning." Neural Information Processing Systems, 1998.

Markdown

[Munos and Moore. "Barycentric Interpolators for Continuous Space and Time Reinforcement Learning." Neural Information Processing Systems, 1998.](https://mlanthology.org/neurips/1998/munos1998neurips-barycentric/)

BibTeX

@inproceedings{munos1998neurips-barycentric,
  title     = {{Barycentric Interpolators for Continuous Space and Time Reinforcement Learning}},
  author    = {Munos, Rémi and Moore, Andrew W.},
  booktitle = {Neural Information Processing Systems},
  year      = {1998},
  pages     = {1024-1030},
  url       = {https://mlanthology.org/neurips/1998/munos1998neurips-barycentric/}
}