SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm

Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

NeurIPS 2020


Abstract

Sample- and computationally efficient distribution estimation is a fundamental tenet in statistics and machine learning. We present SURF, an algorithm for approximating distributions by piecewise polynomials. SURF is: simple, replacing prior complex optimization techniques with a straightforward approximation of each potential polynomial piece through simple empirical-probability interpolation, and using plain divide-and-conquer to merge the pieces; universal, as well-known polynomial-approximation results imply that it accurately approximates a large class of common distributions; robust to distribution mis-specification, as for any degree $d \le 8$ it estimates any distribution to an $\ell_1$ distance $< 3$ times that of the nearest degree-$d$ piecewise polynomial, improving known factor upper bounds of 3 for single polynomials and 15 for polynomials with arbitrarily many pieces; fast, using optimal sample complexity, running in near sample-linear time, and, if given sorted samples, parallelizable to run in sub-linear time. In experiments, SURF outperforms state-of-the-art algorithms.
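To make "empirical-probability interpolation" concrete, here is a minimal sketch of one way to fit a single degree-$d$ polynomial piece: split the interval into $d + 1$ sub-intervals and solve for the polynomial whose mass on each sub-interval equals the empirical mass there. This is an illustration, not the authors' implementation. The function names `fit_piece` and `piece_mass` are hypothetical, equal-width sub-intervals are an assumption (the paper may place its nodes differently), and the divide-and-conquer merging step is not shown.

```python
import numpy as np

def fit_piece(samples, a, b, d, n_total):
    """Fit a degree-d polynomial on [a, b] by matching empirical masses.

    Returns coefficients c (ascending powers) in the rescaled variable
    t = (x - a) / (b - a), so the density in x is
    p(x) = sum_k c[k] * t**k / (b - a).
    Equal-width sub-intervals are an assumption of this sketch.
    """
    samples = np.asarray(samples, dtype=float)
    edges = np.linspace(a, b, d + 2)             # d + 1 sub-intervals
    counts, _ = np.histogram(samples, bins=edges)
    masses = counts / n_total                    # empirical probability of each sub-interval

    t = np.linspace(0.0, 1.0, d + 2)             # the same edges, rescaled to [0, 1]
    k = np.arange(d + 1)
    # A[i, k] = integral of t**k over the i-th sub-interval
    A = (t[1:, None] ** (k + 1) - t[:-1, None] ** (k + 1)) / (k + 1)
    return np.linalg.solve(A, masses)

def piece_mass(coeffs, a, b, lo, hi):
    """Mass that the fitted piece assigns to [lo, hi], a sub-range of [a, b]."""
    k = np.arange(len(coeffs))
    t_lo, t_hi = (lo - a) / (b - a), (hi - a) / (b - a)
    return float(np.sum(coeffs * (t_hi ** (k + 1) - t_lo ** (k + 1)) / (k + 1)))
```

For example, `fit_piece([0.1, 0.2, 0.6, 0.9], 0.0, 1.0, 1, 4)` places half of the samples in each half of $[0, 1]$, so the fitted degree-1 piece is the constant density 1. The linear system is always solvable because matching $d + 1$ interval masses is equivalent to interpolating the cumulative mass at $d + 2$ distinct nodes with a degree-$(d + 1)$ polynomial.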


Cite

Text

Hao et al. "SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm." Neural Information Processing Systems, 2020.

Markdown

[Hao et al. "SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/hao2020neurips-surf/)

BibTeX

@inproceedings{hao2020neurips-surf,
  title     = {{SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm}},
  author    = {Hao, Yi and Jain, Ayush and Orlitsky, Alon and Ravindrakumar, Vaishakh},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/hao2020neurips-surf/}
}