Optimal Score Estimation via Empirical Bayes Smoothing
Abstract
We study the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $\rho^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde \Theta(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function $\|\hat s - s^*\|^2_{L^2(\rho^*)}$ that is commonly used in the score matching literature, highlighting the curse of dimensionality: the sample complexity for accurate score estimation grows exponentially with the dimension $d$. Leveraging key insights from empirical Bayes theory as well as a new convergence rate for the smoothed empirical distribution in Hellinger distance, we show that a regularized score estimator based on a Gaussian kernel attains this rate, which is shown to be optimal by a matching minimax lower bound. We also discuss extensions to estimating $\beta$-Hölder continuous scores with $\beta \leq 1$, as well as the implications of our theory for the sample complexity of score-based generative models.
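To make the construction concrete, the sketch below (in NumPy) computes the score of the Gaussian-kernel-smoothed empirical distribution via the empirical Bayes (Tweedie) identity $\nabla \log (\hat\rho_n * \mathcal{N}(0,\sigma^2 I))(y) = (\mathbb{E}[X \mid Y=y] - y)/\sigma^2$, where the posterior mean is taken with $X$ drawn from the empirical distribution of the samples and $Y = X + \sigma Z$. The function name gaussian_smoothed_score, the norm-clipping regularization, and the bandwidth choice $\sigma \asymp n^{-1/(d+4)}$ in the demo are illustrative assumptions, not necessarily the paper's exact construction.

import numpy as np

def gaussian_smoothed_score(y, samples, sigma, clip=None):
    """Score of the Gaussian-kernel-smoothed empirical distribution.

    y       : (m, d) array of query points
    samples : (n, d) array of i.i.d. observations x_1, ..., x_n
    sigma   : bandwidth of the smoothing kernel N(0, sigma^2 I)
    clip    : optional cap on the norm of the returned score (a crude,
              purely illustrative regularization; not necessarily the
              scheme used in the paper)

    Returns an (m, d) array of grad log(rho_hat_n * N(0, sigma^2 I))(y),
    computed as (E[X | Y = y] - y) / sigma^2 with X drawn from the
    empirical distribution of `samples` and Y = X + sigma * Z.
    """
    y = np.atleast_2d(np.asarray(y, dtype=float))
    x = np.atleast_2d(np.asarray(samples, dtype=float))
    # Log Gaussian kernel weights between each query and each sample, shape (m, n).
    sq_dists = ((y[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    log_w = -sq_dists / (2.0 * sigma ** 2)
    log_w -= log_w.max(axis=1, keepdims=True)            # stabilize before exponentiating
    w = np.exp(log_w)
    post_mean = (w @ x) / w.sum(axis=1, keepdims=True)   # E[X | Y = y] under the empirical prior
    score = (post_mean - y) / sigma ** 2
    if clip is not None:
        # Cap the score norm pointwise; illustrative stand-in for regularization.
        norms = np.linalg.norm(score, axis=1, keepdims=True)
        score = score * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    return score

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 2000, 2
    data = rng.standard_normal((n, d))   # rho* = N(0, I), whose true score is s*(y) = -y
    sigma = n ** (-1.0 / (d + 4))        # bandwidth scaling consistent with the stated rate (assumed)
    queries = rng.standard_normal((5, d))
    print(gaussian_smoothed_score(queries, data, sigma, clip=10.0))
    print(-queries)                      # true score, for comparison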
Cite

Text:
Wibisono et al. "Optimal Score Estimation via Empirical Bayes Smoothing." Conference on Learning Theory, 2024.

Markdown:
[Wibisono et al. "Optimal Score Estimation via Empirical Bayes Smoothing." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/wibisono2024colt-optimal/)

BibTeX:
@inproceedings{wibisono2024colt-optimal,
title = {{Optimal Score Estimation via Empirical Bayes Smoothing}},
author = {Wibisono, Andre and Wu, Yihong and Yang, Kaylee Yingxi},
booktitle = {Conference on Learning Theory},
year = {2024},
pages = {4958--4991},
volume = {247},
url = {https://mlanthology.org/colt/2024/wibisono2024colt-optimal/}
}