Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
Abstract
We study a variance reduction technique for Monte Carlo estimation of functionals in Markov chains. The method is based on designing sequential control variates using successive approximations of the function of interest V. Regular Monte Carlo estimates have a variance of O(1/N), where N is the number of sample trajectories of the Markov chain. Here, we obtain a geometric variance reduction O(ρN) (with ρV-AV, where A is an approximation operator linear in the values. Thus, if V belongs to the right approximation space (i.e. AV=V), the variance decreases geometrically to zero.
Cite
Text
Munos. "Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation." Journal of Machine Learning Research, 2006.Markdown
[Munos. "Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation." Journal of Machine Learning Research, 2006.](https://mlanthology.org/jmlr/2006/munos2006jmlr-geometric/)BibTeX
@article{munos2006jmlr-geometric,
title = {{Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation}},
author = {Munos, Rémi},
journal = {Journal of Machine Learning Research},
year = {2006},
pages = {413-427},
volume = {7},
url = {https://mlanthology.org/jmlr/2006/munos2006jmlr-geometric/}
}