Sketching Information Divergences
Abstract
When comparing discrete probability distributions, natural measures of similarity are not ℓ_p distances but rather information divergences such as Kullback-Leibler and Hellinger. This paper considers some of the issues related to constructing small-space sketches of distributions, a concept related to dimensionality reduction, such that these measures can be approximately computed from the sketches. Related problems for ℓ_p distances are reasonably well understood via a series of results by Johnson and Lindenstrauss [27,18], Alon, Matias, and Szegedy [1], Indyk [24], and Brinkman and Charikar [8]. In contrast, almost no analogous results are known to date about constructing sketches for the information divergences used in statistics and learning theory.
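To make the quantities concrete, the following short Python sketch (illustrative, not from the paper) computes the Kullback-Leibler and Hellinger divergences between two discrete distributions, and contrasts them with a Johnson-Lindenstrauss-style Gaussian linear sketch that approximates the ℓ_2 distance. The function names, the choice of NumPy, and the parameters n and k are assumptions for illustration; the projection shown is the standard JL technique, not one of the paper's constructions.

import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def hellinger_distance(p, q):
    """Hellinger distance H(p, q) = ||sqrt(p) - sqrt(q)||_2 / sqrt(2)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2))

def l2_sketch(x, proj):
    """JL-style linear sketch: project an n-vector down to k dimensions."""
    return proj @ x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 1000, 50  # illustrative domain size and sketch size

    # Two random discrete distributions over n outcomes.
    p = rng.random(n); p /= p.sum()
    q = rng.random(n); q /= q.sum()

    print("KL(p||q):       ", kl_divergence(p, q))
    print("Hellinger(p,q): ", hellinger_distance(p, q))

    # A k x n Gaussian projection preserves ||p - q||_2 up to small
    # distortion with high probability (Johnson-Lindenstrauss); no
    # analogous small-space guarantee is claimed for KL or Hellinger.
    proj = rng.standard_normal((k, n)) / np.sqrt(k)
    exact = np.linalg.norm(p - q)
    approx = np.linalg.norm(l2_sketch(p, proj) - l2_sketch(q, proj))
    print("||p-q||_2 exact vs sketched:", exact, approx)

Running this shows the sketched ℓ_2 distance tracking the exact one closely, which is precisely the behavior the paper asks about, and largely rules out, for the information divergences.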
Cite
Text
Guha et al. "Sketching Information Divergences." Annual Conference on Computational Learning Theory, 2007. doi:10.1007/978-3-540-72927-3_31

Markdown
[Guha et al. "Sketching Information Divergences." Annual Conference on Computational Learning Theory, 2007.](https://mlanthology.org/colt/2007/guha2007colt-sketching/) doi:10.1007/978-3-540-72927-3_31

BibTeX
@inproceedings{guha2007colt-sketching,
title = {{Sketching Information Divergences}},
author = {Guha, Sudipto and Indyk, Piotr and McGregor, Andrew},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2007},
  pages = {424--438},
doi = {10.1007/978-3-540-72927-3_31},
url = {https://mlanthology.org/colt/2007/guha2007colt-sketching/}
}