Density Estimation in Linear Time

Abstract

We consider the problem of choosing a density estimate from a set of densities F, minimizing the L1-distance to an unknown distribution. Devroye and Lugosi [DL01] analyze two algorithms for the problem: the Scheffé tournament winner and the minimum distance estimate. The Scheffé tournament estimate requires fewer computations than the minimum distance estimate, but has strictly weaker guarantees than the latter. We focus on the computational aspect of density estimation. We present two algorithms, both with the same guarantee as the minimum distance estimate. The first one, a modification of the minimum distance estimate, uses the same number (quadratic in |F|) of computations as the Scheffé tournament. The second one, called “efficient minimum loss-weight estimate,” uses only a linear number of computations, assuming that F is preprocessed. We then apply our algorithms to bandwidth selection for kernel estimates and bin-width selection for histogram estimates, yielding efficient procedures for these problems. We also give examples showing that the guarantees of the algorithms cannot be improved and explore randomized algorithms for density estimation.
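
To make the two baseline procedures from [DL01] concrete, the following is a minimal Python sketch of the Scheffé tournament winner and the minimum distance estimate. It is an illustration under simplifying assumptions, not the paper's own algorithms: the densities are assumed to be probability vectors over a small finite domain {0, ..., m-1}, the sample is a list of domain points, and the function names (`scheffe_set`, `scheffe_tournament`, `minimum_distance`) are hypothetical. Ties in a pairwise comparison are counted as a win for neither candidate, which is a choice made here for simplicity.

```python
from itertools import combinations

def scheffe_set(f, g):
    """Points x of the finite domain where f(x) > g(x)."""
    return [x for x in range(len(f)) if f[x] > g[x]]

def mass(f, A):
    """Probability that density f assigns to the set A."""
    return sum(f[x] for x in A)

def empirical_mass(samples, A):
    """Fraction of the sample that falls in the set A."""
    members = set(A)
    return sum(1 for s in samples if s in members) / len(samples)

def scheffe_tournament(densities, samples):
    """Index of the density with the most pairwise wins.

    In each pair (i, j), the candidate whose mass on the Scheffe set
    of the pair is closer to the empirical mass wins the comparison.
    """
    wins = [0] * len(densities)
    for i, j in combinations(range(len(densities)), 2):
        A = scheffe_set(densities[i], densities[j])
        emp = empirical_mass(samples, A)
        err_i = abs(mass(densities[i], A) - emp)
        err_j = abs(mass(densities[j], A) - emp)
        if err_i < err_j:
            wins[i] += 1
        elif err_j < err_i:
            wins[j] += 1
        # Assumption: an exact tie gives no win to either candidate.
    return max(range(len(densities)), key=lambda k: wins[k])

def minimum_distance(densities, samples):
    """Index of the density minimizing the worst-case discrepancy
    with the empirical measure over all pairwise Scheffe sets."""
    n = len(densities)
    sets = [scheffe_set(densities[i], densities[j])
            for i in range(n) for j in range(n) if i != j]
    emp = [empirical_mass(samples, A) for A in sets]

    def worst(f):
        return max(abs(mass(f, A) - e) for A, e in zip(sets, emp))

    return min(range(n), key=lambda k: worst(densities[k]))
```

In this sketch the tournament evaluates each candidate on the sets of the pairs it takes part in, while the minimum distance estimate evaluates every candidate on every pairwise set, which is where the extra computation mentioned in the abstract comes from.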

Cite

Text

Mahalanabis and Stefankovic. "Density Estimation in Linear Time." Annual Conference on Computational Learning Theory, 2008.

Markdown

[Mahalanabis and Stefankovic. "Density Estimation in Linear Time." Annual Conference on Computational Learning Theory, 2008.](https://mlanthology.org/colt/2008/mahalanabis2008colt-density/)

BibTeX

@inproceedings{mahalanabis2008colt-density,
  title     = {{Density Estimation in Linear Time}},
  author    = {Mahalanabis, Satyaki and Stefankovic, Daniel},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2008},
  pages     = {503--512},
  url       = {https://mlanthology.org/colt/2008/mahalanabis2008colt-density/}
}