Sketching, Embedding and Dimensionality Reduction in Information Theoretic Spaces
Abstract
In this paper we show how to embed information distances such as the χ^2 and Jensen-Shannon divergences efficiently into low-dimensional spaces while preserving all pairwise distances. We then prove a dimensionality reduction result for the Hellinger, Jensen-Shannon, and χ^2 divergences that preserves the information geometry of the distributions; specifically, it retains the simplex structure of the space. While our first result already implies that these divergences can be embedded explicitly into Euclidean space, retaining the simplex structure is important because it allows us to perform inference in the reduced space. We also show that these divergences can be sketched efficiently (i.e., up to a multiplicative error, in sublinear space) in the aggregate streaming model. This result is exponentially stronger than the known upper bounds for sketching these distances in the strict turnstile streaming model.
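For intuition, here is a minimal NumPy sketch of the three divergences the abstract discusses. It is illustrative only, not the paper's construction: it assumes the symmetric variant of the χ^2 divergence and the natural-logarithm convention for Jensen-Shannon, and all function names are hypothetical. The square-root map inside `hellinger` is the classical isometric embedding of Hellinger distance into Euclidean space.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance H(p, q) = ||sqrt(p) - sqrt(q)||_2 / sqrt(2).
    The map p -> sqrt(p) embeds Hellinger isometrically into Euclidean
    space (up to the constant 1/sqrt(2))."""
    return float(np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2))

def chi_squared(p, q):
    """Symmetric chi^2 divergence: sum_i (p_i - q_i)^2 / (p_i + q_i).
    (Assumed variant; the asymmetric form divides by q_i instead.)"""
    mask = (p + q) > 0  # skip coordinates where both distributions are zero
    d = p[mask] - q[mask]
    return float(np.sum(d * d / (p + q)[mask]))

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: the average KL divergence from p and q
    to the midpoint distribution m = (p + q) / 2."""
    m = (p + q) / 2

    def kl(a, b):
        nz = a > 0  # use the convention 0 * log 0 = 0
        return float(np.sum(a[nz] * np.log(a[nz] / b[nz])))

    return (kl(p, m) + kl(q, m)) / 2

# Toy check on two distributions over a four-point domain.
p = np.array([0.4, 0.3, 0.2, 0.1])
q = np.array([0.1, 0.2, 0.3, 0.4])
print(hellinger(p, q), chi_squared(p, q), jensen_shannon(p, q))
```

It is a standard fact that squared Hellinger, Jensen-Shannon, and symmetric χ^2 are within constant factors of one another, which is one reason these three divergences are naturally treated together.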
Cite
Text
Abdullah et al. "Sketching, Embedding and Dimensionality Reduction in Information Theoretic Spaces." International Conference on Artificial Intelligence and Statistics, 2016.
Markdown
[Abdullah et al. "Sketching, Embedding and Dimensionality Reduction in Information Theoretic Spaces." International Conference on Artificial Intelligence and Statistics, 2016.](https://mlanthology.org/aistats/2016/abdullah2016aistats-sketching/)
BibTeX
@inproceedings{abdullah2016aistats-sketching,
  title = {{Sketching, Embedding and Dimensionality Reduction in Information Theoretic Spaces}},
  author = {Abdullah, Amirali and Kumar, Ravi and McGregor, Andrew and Vassilvitskii, Sergei and Venkatasubramanian, Suresh},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year = {2016},
  pages = {948--956},
  url = {https://mlanthology.org/aistats/2016/abdullah2016aistats-sketching/}
}