Direct Loss Minimization for Sparse Gaussian Processes

Abstract

The paper provides a thorough investigation of Direct Loss Minimization (DLM), which optimizes the posterior to minimize predictive loss, in sparse Gaussian processes. For the conjugate case, we consider DLM for log-loss and DLM for square loss, showing a significant performance improvement in both cases. The application of DLM in non-conjugate cases is more complex because the logarithm of expectation in the log-loss DLM objective is often intractable, and simple sampling leads to biased estimates of gradients. The paper makes two technical contributions to address this. First, a new method using product sampling is proposed, which gives unbiased estimates of gradients (uPS) for the objective function. Second, a theoretical analysis of biased Monte Carlo estimates (bMC) shows that stochastic gradient descent converges despite the biased gradients. Experiments demonstrate the empirical success of DLM. A comparison of the sampling methods shows that, while uPS is potentially more sample-efficient, bMC provides a better tradeoff in terms of convergence time and computational efficiency.
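
To make the source of the bias concrete, below is a minimal sketch (not the authors' implementation) of the bMC idea for the non-conjugate log-loss term log E_{q(f)}[p(y|f)]: sampling f from the variational marginal q(f) and averaging the likelihood inside the logarithm gives a biased, though consistent, estimate by Jensen's inequality. The Gaussian q(f), the logistic Bernoulli likelihood, and the function name are illustrative assumptions, not taken from the paper's code.

# Minimal sketch of the biased Monte Carlo (bMC) estimate of
# log E_{q(f)}[p(y | f)], assuming q(f) = N(mu, sigma^2) and a
# Bernoulli likelihood with a logistic link (y in {-1, +1}).
import numpy as np

rng = np.random.default_rng(0)

def log_expected_likelihood_bmc(y, mu, sigma, num_samples=100):
    # Draw samples from q(f), average the likelihood, then take the log.
    # Taking the log of a sample mean is biased (Jensen's inequality),
    # but the bias shrinks as num_samples grows.
    f = rng.normal(mu, sigma, size=num_samples)
    p = 1.0 / (1.0 + np.exp(-y * f))
    return np.log(np.mean(p))

# The estimate stabilizes as the number of samples grows.
for n in (1, 10, 100, 10_000):
    print(n, log_expected_likelihood_bmc(y=1, mu=0.5, sigma=1.0, num_samples=n))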

Cite

Text

Wei et al. "Direct Loss Minimization for Sparse Gaussian Processes." Artificial Intelligence and Statistics, 2021.

Markdown

[Wei et al. "Direct Loss Minimization for Sparse Gaussian Processes." Artificial Intelligence and Statistics, 2021.](https://mlanthology.org/aistats/2021/wei2021aistats-direct/)

BibTeX

@inproceedings{wei2021aistats-direct,
  title     = {{Direct Loss Minimization for Sparse Gaussian Processes}},
  author    = {Wei, Yadi and Sheth, Rishit and Khardon, Roni},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2021},
  pages     = {2566-2574},
  volume    = {130},
  url       = {https://mlanthology.org/aistats/2021/wei2021aistats-direct/}
}