Sketching Information Divergences

Abstract

When comparing discrete probability distributions, natural measures of similarity are not ℓ_p distances but rather information divergences such as Kullback-Leibler and Hellinger. This paper considers some of the issues related to constructing small-space sketches of distributions, a concept related to dimensionality reduction, such that these measures can be approximately computed from the sketches. Related problems for ℓ_p distances are reasonably well understood via a series of results by Johnson and Lindenstrauss [27,18], Alon, Matias, and Szegedy [1], Indyk [24], and Brinkman and Charikar [8]. In contrast, almost no analogous results are known to date about constructing sketches for the information divergences used in statistics and learning theory.
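For concreteness (this illustration is not from the paper), the two divergences named in the abstract are straightforward to compute given full access to both distributions; the sketching question the paper studies is whether they can be approximated from much smaller summaries. A minimal sketch of the exact computations, assuming discrete distributions given as probability vectors over a common support:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) = sum_i p_i * log(p_i / q_i).

    Assumes q_i > 0 wherever p_i > 0; terms with p_i = 0 contribute 0.
    Note: KL is asymmetric, so D(p || q) != D(q || p) in general.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def hellinger(p, q):
    """Hellinger distance H(p, q) = (1/sqrt(2)) * || sqrt(p) - sqrt(q) ||_2.

    (Some texts omit the 1/sqrt(2) factor; with it, H lies in [0, 1].)
    """
    s = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(s / 2)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q))
print(hellinger(p, q))
```

Both quantities are zero exactly when p = q; unlike ℓ_p distances, KL is not symmetric and satisfies no triangle inequality, which is part of why sketching techniques for ℓ_p do not transfer directly.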

Cite

Text

Guha et al. "Sketching Information Divergences." Annual Conference on Computational Learning Theory, 2007. doi:10.1007/978-3-540-72927-3_31

Markdown

[Guha et al. "Sketching Information Divergences." Annual Conference on Computational Learning Theory, 2007.](https://mlanthology.org/colt/2007/guha2007colt-sketching/) doi:10.1007/978-3-540-72927-3_31

BibTeX

@inproceedings{guha2007colt-sketching,
  title     = {{Sketching Information Divergences}},
  author    = {Guha, Sudipto and Indyk, Piotr and McGregor, Andrew},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2007},
  pages     = {424--438},
  doi       = {10.1007/978-3-540-72927-3_31},
  url       = {https://mlanthology.org/colt/2007/guha2007colt-sketching/}
}