Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

Abstract

We study a variance reduction technique for Monte Carlo estimation of functionals in Markov chains. The method is based on designing sequential control variates using successive approximations of the function of interest V. Regular Monte Carlo estimates have a variance of O(1/N), where N is the number of sample trajectories of the Markov chain. Here, we obtain a geometric variance reduction O(ρ^N) (with ρ < 1), up to a threshold that depends on the approximation error V − AV, where A is an approximation operator linear in the values. Thus, if V belongs to the right approximation space (i.e. AV = V), the variance decreases geometrically to zero.
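A rough illustration of the control-variate idea the abstract describes (a minimal sketch, not the paper's sequential scheme): in a discounted Markov reward process, each term γ^(t+1)·(W(x_{t+1}) − (PW)(x_t)) has zero mean conditionally on x_t, so subtracting these terms from the Monte Carlo return keeps the estimate unbiased, and when the approximation W coincides with the true value function V the corrected return becomes (up to truncation) deterministic. The chain, rewards, and discount factor below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small 3-state Markov chain (hypothetical, not from the paper).
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.2, 0.7]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

# Exact discounted value function V = (I - gamma P)^{-1} r, playing the
# role of a "perfect" approximation W = V (i.e. AV = V in the abstract).
V = np.linalg.solve(np.eye(3) - gamma * P, r)

def rollout(s, T=100):
    """One truncated trajectory of states starting from s."""
    states = [s]
    for _ in range(T):
        s = rng.choice(3, p=P[s])
        states.append(s)
    return states

def plain_return(states):
    """Ordinary truncated Monte Carlo return."""
    return sum(gamma**t * r[s] for t, s in enumerate(states))

def cv_return(states, W):
    """Return minus the zero-mean control-variate terms
    gamma^{t+1} * (W(x_{t+1}) - (PW)(x_t))."""
    PW = P @ W
    g = plain_return(states)
    for t in range(len(states) - 1):
        g -= gamma**(t + 1) * (W[states[t + 1]] - PW[states[t]])
    return g

N = 2000
plain = np.array([plain_return(rollout(0)) for _ in range(N)])
corrected = np.array([cv_return(rollout(0), V) for _ in range(N)])

print("true V(0)          :", V[0])
print("plain MC   mean/var:", plain.mean(), plain.var())
print("corrected  mean/var:", corrected.mean(), corrected.var())
```

Both estimators are unbiased for V(0), but with W = V the corrected returns have essentially zero variance; with an imperfect W, the residual variance scales with the approximation error V − W, which is what the paper's sequential scheme drives down geometrically.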

Cite

Text

Munos. "Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation." Journal of Machine Learning Research, 2006.

Markdown

[Munos. "Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation." Journal of Machine Learning Research, 2006.](https://mlanthology.org/jmlr/2006/munos2006jmlr-geometric/)

BibTeX

@article{munos2006jmlr-geometric,
  title     = {{Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation}},
  author    = {Munos, Rémi},
  journal   = {Journal of Machine Learning Research},
  year      = {2006},
  pages     = {413--427},
  volume    = {7},
  url       = {https://mlanthology.org/jmlr/2006/munos2006jmlr-geometric/}
}