Optimal Score Estimation via Empirical Bayes Smoothing

Abstract

We study the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $\rho^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde \Theta(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function $\|\hat s - s^*\|^2_{L^2(\rho^*)}$ that is commonly used in the score matching literature, highlighting the curse of dimensionality where sample complexity for accurate score estimation grows exponentially with the dimension $d$. Leveraging key insights in empirical Bayes theory as well as a new convergence rate of the smoothed empirical distribution in Hellinger distance, we show that a regularized score estimator based on a Gaussian kernel attains this rate, shown optimal by a matching minimax lower bound. We also discuss extensions to estimating $\beta$-Hölder continuous scores with $\beta \leq 1$, as well as the implications of our theory for the sample complexity of score-based generative models.
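To make the estimator described in the abstract concrete, below is a minimal sketch (not the authors' exact construction) of a score estimator built from a Gaussian kernel density estimate, with a hypothetical density floor standing in for the regularization; the function name `regularized_gaussian_score` and the parameters `bandwidth` and `density_floor` are illustrative choices, not taken from the paper.

import numpy as np

def regularized_gaussian_score(samples, query, bandwidth, density_floor=1e-6):
    """Sketch of a Gaussian-kernel score estimator with a density-floor regularizer.

    samples:       (n, d) i.i.d. observations from rho*
    query:         (m, d) points at which to estimate the score
    bandwidth:     Gaussian smoothing scale h
    density_floor: threshold below which the estimate is zeroed out
                   (a hypothetical choice of regularization)
    Returns an (m, d) array approximating s*(x) = grad log rho*(x).
    """
    n, d = samples.shape
    h2 = bandwidth ** 2

    diffs = query[:, None, :] - samples[None, :, :]       # (m, n, d): x - X_i
    logw = -np.sum(diffs ** 2, axis=-1) / (2.0 * h2)      # (m, n): Gaussian log-weights

    # Stable weights: the per-row max shift cancels in the ratio below.
    shift = logw.max(axis=1, keepdims=True)
    w = np.exp(logw - shift)

    # Score of the Gaussian-smoothed (KDE) density with bandwidth h:
    #   grad log p_h(x) = -(sum_i w_i (x - X_i)) / (h^2 sum_i w_i).
    grad_log = -np.einsum("mn,mnd->md", w, diffs) / (h2 * w.sum(axis=1, keepdims=True))

    # Regularization: zero the estimate where the smoothed density is tiny,
    # mimicking the role of the regularizer in the abstract's estimator.
    log_density = shift[:, 0] + np.log(w.sum(axis=1)) - np.log(n) \
                  - 0.5 * d * np.log(2.0 * np.pi * h2)
    grad_log[log_density < np.log(density_floor)] = 0.0
    return grad_log

Consistent with the stated rate $\tilde \Theta(n^{-\frac{2}{d+4}})$ for Lipschitz scores, one would expect a bandwidth on the order of $h \asymp n^{-\frac{1}{d+4}}$ to balance bias and variance in such a sketch, though the precise tuning and regularization used in the paper may differ.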

Cite

Text

Wibisono et al. "Optimal Score Estimation via Empirical Bayes Smoothing." Conference on Learning Theory, 2024.

Markdown

[Wibisono et al. "Optimal Score Estimation via Empirical Bayes Smoothing." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/wibisono2024colt-optimal/)

BibTeX

@inproceedings{wibisono2024colt-optimal,
  title     = {{Optimal Score Estimation via Empirical Bayes Smoothing}},
  author    = {Wibisono, Andre and Wu, Yihong and Yang, Kaylee Yingxi},
  booktitle = {Conference on Learning Theory},
  year      = {2024},
  pages     = {4958--4991},
  volume    = {247},
  url       = {https://mlanthology.org/colt/2024/wibisono2024colt-optimal/}
}