Structure in the Space of Value Functions

Foster, David J.; Dayan, Peter

doi:10.1023/A:1017944732463

Machine Learning (MLJ), vol. 49, 2002, pp. 325-346

Abstract

Solving many different optimal control tasks efficiently within the same underlying environment requires decomposing the environment into its computationally elemental fragments. We suggest how to find such fragmentations by applying unsupervised mixture-model learning methods to data derived from optimal value functions for multiple tasks, and show that these fragmentations accord with observable structure in the environments. Further, we present evidence that such fragments can be useful in a practical reinforcement learning context, by facilitating online actor-critic learning of multiple-goal MDPs.

Cite

Text

Foster and Dayan. "Structure in the Space of Value Functions." Machine Learning, 2002. doi:10.1023/A:1017944732463

Markdown

[Foster and Dayan. "Structure in the Space of Value Functions." Machine Learning, 2002.](https://mlanthology.org/mlj/2002/foster2002mlj-structure/) doi:10.1023/A:1017944732463

BibTeX

@article{foster2002mlj-structure,
  title     = {{Structure in the Space of Value Functions}},
  author    = {Foster, David J. and Dayan, Peter},
  journal   = {Machine Learning},
  year      = {2002},
  pages     = {325--346},
  doi       = {10.1023/A:1017944732463},
  volume    = {49},
  url       = {https://mlanthology.org/mlj/2002/foster2002mlj-structure/}
}