Constructing Basis Functions from Directed Graphs for Value Function Approximation

Abstract

Basis functions derived from an undirected graph connecting nearby samples from a Markov decision process (MDP) have proven useful for approximating value functions. The success of this technique is attributed to the smoothness of the basis functions with respect to the state space geometry. This paper explores the properties of bases created from directed graphs, which are a more natural fit for expressing state connectivity. Digraphs capture the effect of non-reversible MDPs whose value functions may not be smooth across adjacent states. We provide an analysis using the Dirichlet sum of the directed graph Laplacian to show how the smoothness of the basis functions is affected by the graph's invariant distribution. Experiments in discrete and continuous MDPs with non-reversible actions demonstrate a significant improvement in the policies learned using directed graph bases.
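The construction the abstract refers to can be sketched in a few lines of NumPy. The following is an illustrative sketch (not the authors' code), assuming Chung's combinatorial directed-graph Laplacian L = Φ − (ΦP + PᵀΦ)/2, where P is the transition matrix of a random walk on the digraph and Φ is the diagonal matrix of its invariant distribution; the example graph, the teleporting mixture, and all helper names are assumptions for illustration. The final assertion checks the Dirichlet sum identity fᵀLf = ½ Σ_{u,v} φ_u P_{uv} (f_u − f_v)², the smoothness measure mentioned in the abstract.

```python
import numpy as np

def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Power iteration for the invariant distribution phi, with phi P = phi."""
    n = P.shape[0]
    phi = np.full(n, 1.0 / n)
    for _ in range(iters):
        nxt = phi @ P
        if np.abs(nxt - phi).sum() < tol:
            break
        phi = nxt
    return phi / phi.sum()

def directed_laplacian(P):
    """Combinatorial directed-graph Laplacian L = Phi - (Phi P + P^T Phi)/2."""
    phi = stationary_distribution(P)
    Phi = np.diag(phi)
    return Phi - (Phi @ P + P.T @ Phi) / 2.0, phi

# Small non-reversible example (assumption): a 3-state directed cycle mixed
# with a little uniform teleporting mass so the chain is aperiodic and
# strongly connected, guaranteeing a unique invariant distribution.
cycle = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])
P = 0.9 * cycle + 0.1 / 3.0

L, phi = directed_laplacian(P)

# L is symmetric by construction, so its eigenvectors are real-valued and can
# serve as basis functions for value function approximation.
eigvals, basis = np.linalg.eigh(L)

# Dirichlet sum identity: f^T L f = 1/2 * sum_{u,v} phi_u P_{uv} (f_u - f_v)^2.
f = np.array([1.0, -2.0, 0.5])
lhs = f @ L @ f
diff = f[:, None] - f[None, :]
rhs = 0.5 * np.sum(phi[:, None] * P * diff**2)
assert np.isclose(lhs, rhs)
```

The basis functions (columns of `basis`) are ordered by eigenvalue, i.e., by how smooth they are with respect to the digraph's invariant distribution; the smoothest (constant) eigenvector corresponds to eigenvalue zero.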

Cite

Text

Johns and Mahadevan. "Constructing Basis Functions from Directed Graphs for Value Function Approximation." International Conference on Machine Learning, 2007. doi:10.1145/1273496.1273545

Markdown

[Johns and Mahadevan. "Constructing Basis Functions from Directed Graphs for Value Function Approximation." International Conference on Machine Learning, 2007.](https://mlanthology.org/icml/2007/johns2007icml-constructing/) doi:10.1145/1273496.1273545

BibTeX

@inproceedings{johns2007icml-constructing,
  title     = {{Constructing Basis Functions from Directed Graphs for Value Function Approximation}},
  author    = {Johns, Jeffrey and Mahadevan, Sridhar},
  booktitle = {International Conference on Machine Learning},
  year      = {2007},
  pages     = {385--392},
  doi       = {10.1145/1273496.1273545},
  url       = {https://mlanthology.org/icml/2007/johns2007icml-constructing/}
}