VarGrad: A Low-Variance Gradient Estimator for Variational Inference
Abstract
We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the log-variance loss. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call VarGrad due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.
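The abstract's central identity can be made concrete. Writing f(z) = log q(z) − log p(x, z) for the log-ratio between the variational approximation and the unnormalised posterior, the log-variance loss is the variance of f, and differentiating half of its empirical variance over a batch of samples, while treating the samples as fixed, recovers the score-function estimator with leave-one-out control variates. Below is a minimal NumPy sketch under illustrative assumptions: a Gaussian variational family and a toy standard-normal target; names such as `vargrad`, `log_p`, and `n_samples` are hypothetical and not taken from the paper.

```python
# Minimal sketch of the VarGrad estimator (score function with
# leave-one-out control variates), under the toy assumptions above.
import numpy as np

rng = np.random.default_rng(0)

def log_p(z):
    # Toy unnormalised target: standard-normal log-density (up to a constant).
    return -0.5 * z**2

def vargrad(mu, log_sigma, n_samples=16):
    """Unbiased estimate of the gradient of the negative ELBO w.r.t.
    (mu, log_sigma) for q(z) = N(mu, exp(2 * log_sigma))."""
    sigma = np.exp(log_sigma)
    z = mu + sigma * rng.standard_normal(n_samples)        # z ~ q
    log_q = -0.5 * ((z - mu) / sigma) ** 2 - log_sigma - 0.5 * np.log(2 * np.pi)
    f = log_q - log_p(z)                                   # log-ratio per sample
    # Score of q w.r.t. mu and log_sigma, evaluated at the samples.
    score_mu = (z - mu) / sigma**2
    score_ls = ((z - mu) / sigma) ** 2 - 1.0
    # Leave-one-out estimator: (1 / (S - 1)) * sum_s (f_s - mean(f)) * score_s,
    # which equals the gradient of half the unbiased sample variance of f
    # when the samples are held fixed under differentiation.
    w = (f - f.mean()) / (n_samples - 1)
    return np.array([np.sum(w * score_mu), np.sum(w * score_ls)])

# One stochastic-gradient step towards the target.
mu, log_sigma = 2.0, 0.0
g = vargrad(mu, log_sigma)
mu, log_sigma = mu - 0.1 * g[0], log_sigma - 0.1 * g[1]
```

Because the centring term subtracts the batch mean of the log-ratio, the estimator stays unbiased (the expected score is zero) while it can exhibit substantially lower variance than the plain score-function (REINFORCE) estimator; repeating the update drives mu towards 0 and sigma towards 1 in this toy example.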
Cite
Text
Richter et al. "VarGrad: A Low-Variance Gradient Estimator for Variational Inference." Neural Information Processing Systems, 2020.
Markdown
[Richter et al. "VarGrad: A Low-Variance Gradient Estimator for Variational Inference." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/richter2020neurips-vargrad/)
BibTeX
@inproceedings{richter2020neurips-vargrad,
title = {{VarGrad: A Low-Variance Gradient Estimator for Variational Inference}},
author = {Richter, Lorenz and Boustati, Ayman and Nüsken, Nikolas and Ruiz, Francisco and Akyildiz, Omer Deniz},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/richter2020neurips-vargrad/}
}