Proto-Value Functions: Developmental Reinforcement Learning

Abstract

This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis functions that form the building blocks of all value functions on a given state space manifold. Proto-value functions are learned not from rewards, but instead from analyzing the topology of the state space. Formally, proto-value functions are Fourier eigenfunctions of the Laplace-Beltrami diffusion operator on the state space manifold. Proto-value functions facilitate structural decomposition of large state spaces, and form geodesically smooth orthonormal basis functions for approximating any value function. The theoretical basis for proto-value functions combines insights from spectral graph theory, harmonic analysis, and Riemannian manifolds. Proto-value functions enable a novel generation of algorithms called representation policy iteration, unifying the learning of representation and behavior.
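As a rough illustration of the idea in the abstract, the sketch below computes a reward-independent basis on a discrete state space. It is a minimal sketch under stated assumptions, not the paper's implementation: the continuous Laplace-Beltrami operator is replaced by its discrete analogue, the combinatorial graph Laplacian L = D - A, the state space is a hypothetical 5x5 grid world with four-neighbor connectivity, and the target value function (negative Manhattan distance to a corner goal) is invented for the example. The helper names `grid_adjacency` and `proto_value_functions` are illustrative, not taken from the source.

```python
import numpy as np


def grid_adjacency(rows, cols):
    """Adjacency matrix of a rows x cols grid with 4-neighbor connectivity."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # edge to the right neighbor
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < rows:  # edge to the neighbor below
                A[i, i + cols] = A[i + cols, i] = 1.0
    return A


def proto_value_functions(A, k):
    """Return the k smoothest eigenpairs of the graph Laplacian L = D - A.

    Assumption: the graph Laplacian stands in for the Laplace-Beltrami
    operator named in the abstract. No reward information is used.
    """
    D = np.diag(A.sum(axis=1))
    L = D - A
    # eigh handles symmetric matrices and returns eigenvalues in ascending
    # order with orthonormal eigenvectors as columns.
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvals[:k], eigvecs[:, :k]


rows, cols, k = 5, 5, 8
A = grid_adjacency(rows, cols)
eigvals, basis = proto_value_functions(A, k)

# Hypothetical target value function: negative Manhattan distance to a
# goal in the bottom-right corner of the grid.
target = np.array(
    [-((rows - 1 - r) + (cols - 1 - c)) for r in range(rows) for c in range(cols)],
    dtype=float,
)

# The basis columns are orthonormal, so the least-squares projection
# reduces to inner products with the target.
weights = basis.T @ target
approx = basis @ weights
error = np.linalg.norm(target - approx)
print(f"approximation error with {k} of {rows * cols} basis functions: {error:.4f}")
```

The basis is computed once from the connectivity of the state space and could be reused for any reward function defined on the same grid; only the projection weights depend on the task. Taking the eigenvectors with the smallest eigenvalues keeps the functions that vary most slowly along the graph, which is the sense in which the basis is smooth.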

Cite

Text

Mahadevan. "Proto-Value Functions: Developmental Reinforcement Learning." International Conference on Machine Learning, 2005. doi:10.1145/1102351.1102421

Markdown

[Mahadevan. "Proto-Value Functions: Developmental Reinforcement Learning." International Conference on Machine Learning, 2005.](https://mlanthology.org/icml/2005/mahadevan2005icml-proto/) doi:10.1145/1102351.1102421

BibTeX

@inproceedings{mahadevan2005icml-proto,
  title     = {{Proto-Value Functions: Developmental Reinforcement Learning}},
  author    = {Mahadevan, Sridhar},
  booktitle = {International Conference on Machine Learning},
  year      = {2005},
  pages     = {553-560},
  doi       = {10.1145/1102351.1102421},
  url       = {https://mlanthology.org/icml/2005/mahadevan2005icml-proto/}
}