Unified Sample-Optimal Property Estimation in Near-Linear Time

Abstract

We consider the fundamental learning problem of estimating properties of distributions over large domains. Using a novel piecewise-polynomial approximation technique, we derive the first unified methodology for constructing sample- and time-efficient estimators for all sufficiently smooth, symmetric and non-symmetric, additive properties. This technique yields near-linear-time computable estimators whose approximation values are asymptotically optimal and highly-concentrated, resulting in the first: 1) estimators achieving the $\mathcal{O}(k/(\varepsilon^2\log k))$ min-max $\varepsilon$-error sample complexity for all $k$-symbol Lipschitz properties; 2) unified near-optimal differentially private estimators for a variety of properties; 3) unified estimator achieving optimal bias and near-optimal variance for five important properties; 4) near-optimal sample-complexity estimators for several important symmetric properties over both domain sizes and confidence levels.

Cite

Text

Hao and Orlitsky. "Unified Sample-Optimal Property Estimation in Near-Linear Time." Neural Information Processing Systems, 2019.

Markdown

[Hao and Orlitsky. "Unified Sample-Optimal Property Estimation in Near-Linear Time." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/hao2019neurips-unified/)

BibTeX

@inproceedings{hao2019neurips-unified,
  title     = {{Unified Sample-Optimal Property Estimation in Near-Linear Time}},
  author    = {Hao, Yi and Orlitsky, Alon},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {11106-11116},
  url       = {https://mlanthology.org/neurips/2019/hao2019neurips-unified/}
}