DiffRed: Dimensionality Reduction Guided by Stable Rank

Abstract

In this work, we propose \textit{DiffRed}, a novel dimensionality reduction technique that first projects the data matrix $A$ along its first $k_1$ principal components and then projects the residual matrix $A^{*}$ (left after subtracting the $k_1$-rank approximation of $A$) along $k_2$ Gaussian random vectors. We evaluate two metrics: \emph{M1}, the distortion of the mean-squared pairwise distance, and \emph{Stress}, the normalized RMS of the distortion of the pairwise distances. We rigorously prove that \textit{DiffRed} achieves a general upper bound of $O\left(\sqrt{\frac{1-p}{k_2}}\right)$ on \emph{Stress} and $O\left(\frac{1-p}{\sqrt{k_2\,\rho(A^{*})}}\right)$ on \emph{M1}, where $p$ is the fraction of variance explained by the first $k_1$ principal components and $\rho(A^{*})$ is the \textit{stable rank} of $A^{*}$. These bounds are tighter than the currently known results for random maps. Our extensive experiments on a variety of real-world datasets demonstrate that \textit{DiffRed} achieves near-zero \emph{M1} and much lower values of \emph{Stress} than well-known dimensionality reduction techniques. In particular, \textit{DiffRed} can map a 6-million-dimensional dataset to 10 dimensions with 54% lower \emph{Stress} than PCA.
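The two-stage projection described in the abstract can be sketched as follows. This is a minimal NumPy illustration of the idea, not the authors' reference implementation: the function names, the centering step, and the $1/\sqrt{k_2}$ scaling of the Gaussian map (the standard choice that preserves squared norms in expectation) are assumptions for the sketch.

```python
import numpy as np

def diffred(A, k1, k2, seed=None):
    """Embed the rows of A (n x D) into k1 + k2 dimensions:
    PCA on the top k1 directions, then a Gaussian random map
    on the residual. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    A = A - A.mean(axis=0)                      # center the data
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    V1 = Vt[:k1].T                              # top-k1 principal directions
    Z1 = A @ V1                                 # PCA coordinates
    A_res = A - Z1 @ V1.T                       # residual after rank-k1 approximation
    # Gaussian random map; 1/sqrt(k2) scaling preserves norms in expectation
    R = rng.standard_normal((A.shape[1], k2)) / np.sqrt(k2)
    Z2 = A_res @ R
    return np.hstack([Z1, Z2])

def stable_rank(M):
    """Stable rank ||M||_F^2 / ||M||_2^2, the quantity governing the M1 bound."""
    s = np.linalg.svd(M, compute_uv=False)
    return (s ** 2).sum() / s[0] ** 2
```

For example, `diffred(X, 3, 7)` maps each row of `X` to a 10-dimensional vector, with the split between $k_1$ and $k_2$ trading off the deterministic PCA directions against the randomized residual projection.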

Cite

Text

Shukla et al. "DiffRed: Dimensionality Reduction Guided by Stable Rank." Artificial Intelligence and Statistics, 2024.

Markdown

[Shukla et al. "DiffRed: Dimensionality Reduction Guided by Stable Rank." Artificial Intelligence and Statistics, 2024.](https://mlanthology.org/aistats/2024/shukla2024aistats-diffred/)

BibTeX

@inproceedings{shukla2024aistats-diffred,
  title     = {{DiffRed: Dimensionality Reduction Guided by Stable Rank}},
  author    = {Shukla, Prarabdh and Raj Gupta, Gagan and Dutta, Kunal},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2024},
  pages     = {3430--3438},
  volume    = {238},
  url       = {https://mlanthology.org/aistats/2024/shukla2024aistats-diffred/}
}