Kernel Stein Discrepancy Descent

Abstract

Among dissimilarities between probability distributions, the Kernel Stein Discrepancy (KSD) has received much interest recently. We investigate the properties of its Wasserstein gradient flow to approximate a target probability distribution $\pi$ on $\mathbb{R}^d$, known up to a normalization constant. This leads to a straightforwardly implementable, deterministic score-based method to sample from $\pi$, named KSD Descent, which uses a set of particles to approximate $\pi$. Remarkably, owing to a tractable loss function, KSD Descent can leverage robust parameter-free optimization schemes such as L-BFGS; this contrasts with other popular particle-based schemes such as the Stein Variational Gradient Descent algorithm. We study the convergence properties of KSD Descent and demonstrate its practical relevance. However, we also highlight failure cases by showing that the algorithm can get stuck in spurious local minima.
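The particle scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel with bandwidth `h` and a standard Gaussian target whose score is `-x`, and for brevity it lets SciPy approximate the gradient by finite differences; a practical implementation would supply an analytic or automatically differentiated gradient. The names `score_std_gaussian`, `ksd_squared`, and `ksd_descent` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def score_std_gaussian(x):
    """Score (gradient of the log density) of a standard Gaussian target."""
    return -x


def ksd_squared(x, score, h=1.0):
    """Squared KSD of the empirical measure of particles x, shape (n, d).

    Assumes the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 h)).
    """
    n, d = x.shape
    s = score(x)                                    # (n, d) scores at particles
    diff = x[:, None, :] - x[None, :, :]            # diff[i, j] = x_i - x_j
    r2 = np.sum(diff ** 2, axis=-1)                 # squared pairwise distances
    k = np.exp(-r2 / (2.0 * h))                     # kernel matrix
    ss = s @ s.T                                    # s(x_i) . s(x_j)
    # s(x_i) . grad_y k + grad_x k . s(x_j), both divided by k
    cross = (np.einsum("id,ijd->ij", s, diff)
             - np.einsum("jd,ijd->ij", s, diff)) / h
    trace = d / h - r2 / h ** 2                     # tr(grad_x grad_y k) / k
    stein_kernel = k * (ss + cross + trace)
    return stein_kernel.mean()                      # average over all pairs


def ksd_descent(x0, score, h=1.0, max_iter=200):
    """Minimize the squared KSD over particle positions with L-BFGS."""
    n, d = x0.shape

    def objective(flat):
        return ksd_squared(flat.reshape(n, d), score, h)

    # No gradient is passed, so SciPy falls back to finite differences.
    result = minimize(objective, x0.ravel(), method="L-BFGS-B",
                      options={"maxiter": max_iter})
    return result.x.reshape(n, d)


# Start the particles far from the target and let the optimizer move them.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=3.0, scale=0.5, size=(15, 1))
x_final = ksd_descent(x0, score_std_gaussian)

ksd_before = ksd_squared(x0, score_std_gaussian)
ksd_after = ksd_squared(x_final, score_std_gaussian)
print(f"squared KSD before: {ksd_before:.4f}, after: {ksd_after:.4f}")
```

Because the loss is an explicit function of the particle positions, it can be handed to an off-the-shelf optimizer without choosing a step size, which is the property the abstract contrasts with Stein Variational Gradient Descent. A decrease of the loss does not guarantee a good approximation of the target, since the abstract notes that the algorithm can get stuck in spurious local minima.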

Cite

Text

Korba et al. "Kernel Stein Discrepancy Descent." International Conference on Machine Learning, 2021.

Markdown

[Korba et al. "Kernel Stein Discrepancy Descent." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/korba2021icml-kernel/)

BibTeX

@inproceedings{korba2021icml-kernel,
  title     = {{Kernel Stein Discrepancy Descent}},
  author    = {Korba, Anna and Aubin-Frankowski, Pierre-Cyril and Majewski, Szymon and Ablin, Pierre},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {5719--5730},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/korba2021icml-kernel/}
}