Direct Loss Minimization for Sparse Gaussian Processes
Abstract
The paper provides a thorough investigation of Direct Loss Minimization (DLM), which optimizes the posterior to minimize predictive loss, in sparse Gaussian processes. For the conjugate case, we consider DLM for log-loss and DLM for square loss, showing a significant performance improvement in both cases. The application of DLM in non-conjugate cases is more complex because the logarithm of expectation in the log-loss DLM objective is often intractable and simple sampling leads to biased estimates of gradients. The paper makes two technical contributions to address this. First, a new method using product sampling is proposed, which gives unbiased estimates of gradients (uPS) for the objective function. Second, a theoretical analysis of biased Monte Carlo estimates (bMC) shows that stochastic gradient descent converges despite the biased gradients. Experiments demonstrate the empirical success of DLM. A comparison of the sampling methods shows that, while uPS is potentially more sample-efficient, bMC provides a better tradeoff in terms of convergence time and computational efficiency.
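The central difficulty in the non-conjugate case is that the log-loss DLM objective contains a term of the form log E_q[p(y|f)], and taking the log of a finite-sample Monte Carlo average is, by Jensen's inequality, a downward-biased estimate of that quantity. The sketch below is not from the paper; it is a minimal, hypothetical 1-D Gaussian example (chosen so the expectation has a closed form) that only illustrates the bias motivating the uPS and bMC estimators studied here.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D setup (not from the paper): q(f) = N(mu, sigma^2),
# Gaussian likelihood p(y | f) = N(y; f, noise^2).
mu, sigma = 0.3, 0.8   # variational posterior over the latent function value f
y, noise = 1.0, 0.5    # observation and likelihood noise standard deviation

def log_lik(f):
    # log p(y | f) for the Gaussian observation model
    return -0.5 * ((y - f) / noise) ** 2 - 0.5 * np.log(2 * np.pi * noise ** 2)

# Exact log E_q[p(y | f)]: with Gaussian q and Gaussian likelihood, the
# marginal is y ~ N(mu, sigma^2 + noise^2), available in closed form.
var = sigma ** 2 + noise ** 2
exact = -0.5 * (y - mu) ** 2 / var - 0.5 * np.log(2 * np.pi * var)

# Naive Monte Carlo: log of a sample average of p(y | f_s) over S draws from q.
# Repeating the estimate many times and averaging exposes the systematic bias.
S = 5
f_samples = rng.normal(mu, sigma, size=(10000, S))
naive = np.log(np.exp(log_lik(f_samples)).mean(axis=1))

print(f"exact log E_q[p(y|f)]           : {exact:.4f}")
print(f"naive MC estimate (S={S}), mean : {naive.mean():.4f}  (below exact, per Jensen)")

As the abstract describes, uPS avoids this bias by sampling from a distribution proportional to the product of the posterior and the likelihood, whereas bMC accepts the bias and the paper's analysis shows that stochastic gradient descent still converges; the sketch above only shows why the issue arises in the first place.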
Cite
Text
Wei et al. "Direct Loss Minimization for Sparse Gaussian Processes." Artificial Intelligence and Statistics, 2021.
Markdown
[Wei et al. "Direct Loss Minimization for Sparse Gaussian Processes." Artificial Intelligence and Statistics, 2021.](https://mlanthology.org/aistats/2021/wei2021aistats-direct/)
BibTeX
@inproceedings{wei2021aistats-direct,
title = {{Direct Loss Minimization for Sparse Gaussian Processes}},
author = {Wei, Yadi and Sheth, Rishit and Khardon, Roni},
booktitle = {Artificial Intelligence and Statistics},
year = {2021},
pages = {2566--2574},
volume = {130},
url = {https://mlanthology.org/aistats/2021/wei2021aistats-direct/}
}