The Infinite Regionalized Policy Representation
Abstract
We introduce the infinite regionalized policy representation (iRPR), a nonparametric policy for reinforcement learning in partially observable Markov decision processes (POMDPs). The iRPR assumes an unbounded set of decision states a priori, and infers the number of states needed to represent the policy from experience. We propose algorithms for learning the number of decision states while maintaining a proper balance between exploration and exploitation. Convergence analysis is provided, along with performance evaluations on benchmark problems.
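The nonparametric idea in the abstract — an unbounded set of decision states whose effective number is inferred rather than fixed — is commonly formalized with a stick-breaking (Dirichlet-process) prior. The following is a minimal illustrative sketch of that generic construction, not the paper's actual iRPR inference algorithm; the function name and truncation threshold are assumptions for illustration.

```python
import random

def stick_breaking_weights(alpha, threshold=1e-4, max_states=1000):
    """Sample mixture weights from a truncated stick-breaking construction.

    Illustrates the nonparametric principle behind models like the iRPR:
    an a priori unbounded set of states, whose effective number emerges
    from the weights rather than being fixed in advance. This is a
    generic Dirichlet-process sketch, not the paper's method.
    """
    weights = []
    remaining = 1.0  # probability mass not yet assigned to a state
    while remaining > threshold and len(weights) < max_states:
        # Break off a Beta(1, alpha)-distributed fraction of the
        # remaining stick; smaller alpha concentrates mass on few states.
        b = random.betavariate(1.0, alpha)
        weights.append(remaining * b)
        remaining *= (1.0 - b)
    return weights

random.seed(0)
w = stick_breaking_weights(alpha=2.0)
print(len(w))   # effective number of decision states for this draw
print(sum(w))   # total mass, close to 1 after truncation
```

Larger values of `alpha` tend to spread mass over more states, so the sampled "number of decision states" grows with the concentration parameter.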
Cite
Text
Liu et al. "The Infinite Regionalized Policy Representation." International Conference on Machine Learning, 2011.
Markdown
[Liu et al. "The Infinite Regionalized Policy Representation." International Conference on Machine Learning, 2011.](https://mlanthology.org/icml/2011/liu2011icml-infinite/)
BibTeX
@inproceedings{liu2011icml-infinite,
title = {{The Infinite Regionalized Policy Representation}},
author = {Liu, Miao and Liao, Xuejun and Carin, Lawrence},
booktitle = {International Conference on Machine Learning},
year = {2011},
pages = {769-776},
url = {https://mlanthology.org/icml/2011/liu2011icml-infinite/}
}