Dimensionality Reduction Has Quantifiable Imperfections: Two Geometric Bounds

Abstract

In this paper, we investigate dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR map can achieve perfect precision and perfect recall simultaneously. Thus a continuous DR map, since continuity forces perfect recall, must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure "how" wrong the retrieved data is. We therefore propose a new measure based on the Wasserstein distance that comes with a similar theoretical guarantee. A key technical step in our proofs is a particular optimization problem for the $L_2$-Wasserstein distance over a constrained set of distributions. We provide a complete solution to this optimization problem, which may be of independent technical interest.
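
As a concrete illustration of the quantities the bounds concern, the sketch below (not from the paper) estimates the precision and recall of k-nearest-neighbor retrieval through a DR map, and computes a one-dimensional $L_2$-Wasserstein distance between two empirical samples. The Gaussian data, PCA as the DR map, the neighborhood sizes, and the use of numpy/scikit-learn are illustrative assumptions, not choices made by the authors.

# Illustrative sketch only: precision/recall of k-NN retrieval under a DR map,
# and a 1-D L2-Wasserstein distance. Data, map, and sizes are hypothetical.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))            # hypothetical high-dimensional data
Z = PCA(n_components=2).fit_transform(X)  # an illustrative DR map

k_true, k_retr = 10, 15                   # true vs. retrieved neighborhood sizes (arbitrary)
true_idx = NearestNeighbors(n_neighbors=k_true + 1).fit(X).kneighbors(X, return_distance=False)[:, 1:]
retr_idx = NearestNeighbors(n_neighbors=k_retr + 1).fit(Z).kneighbors(Z, return_distance=False)[:, 1:]
# Precision: fraction of retrieved (low-dimensional) neighbors that are true (high-dimensional) neighbors.
# Recall: fraction of true neighbors that are retrieved.
overlap = np.array([len(set(t) & set(r)) for t, r in zip(true_idx, retr_idx)])
print("mean precision:", (overlap / k_retr).mean())
print("mean recall:   ", (overlap / k_true).mean())
# 1-D L2-Wasserstein (W2) distance between two equal-size empirical samples:
# with uniform weights it is the L2 distance between the sorted samples.
a = rng.normal(loc=0.0, size=1000)
b = rng.normal(loc=1.0, size=1000)
print("empirical W2:", np.sqrt(np.mean((np.sort(a) - np.sort(b)) ** 2)))  # close to 1.0 for a unit mean shift

Using a larger retrieved neighborhood than the true one makes the precision/recall trade-off visible empirically; the paper's results bound how well any continuous (in particular, Lipschitz continuous) DR map can do on the precision side.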

Cite

Text

Lui et al. "Dimensionality Reduction Has Quantifiable Imperfections: Two Geometric Bounds." Neural Information Processing Systems, 2018.

Markdown

[Lui et al. "Dimensionality Reduction Has Quantifiable Imperfections: Two Geometric Bounds." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/lui2018neurips-dimensionality/)

BibTeX

@inproceedings{lui2018neurips-dimensionality,
  title     = {{Dimensionality Reduction Has Quantifiable Imperfections: Two Geometric Bounds}},
  author    = {Lui, Kry and Ding, Gavin Weiguang and Huang, Ruitong and McCann, Robert},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {8453--8463},
  url       = {https://mlanthology.org/neurips/2018/lui2018neurips-dimensionality/}
}