Computing Approximate $\ell_p$ Sensitivities
Abstract
Recent works in dimensionality reduction for regression tasks have introduced the notion of sensitivity, an estimate of the importance of a specific datapoint in a dataset, offering provable guarantees on the quality of the approximation after removing low-sensitivity datapoints via subsampling. However, fast algorithms for approximating sensitivities, which we show is equivalent to approximate regression, are known for only the $\ell_2$ setting, in which they are popularly termed leverage scores. In this work, we provide the first efficient algorithms for approximating $\ell_p$ sensitivities and other summary statistics of a given matrix. In particular, for a given $n \times d$ matrix, we compute $\alpha$-approximation to its $\ell_1$ sensitivities at the cost of $n/\alpha$ sensitivity computations. For estimating the total $\ell_p$ sensitivity (i.e. the sum of $\ell_p$ sensitivities), we provide an algorithm based on importance sampling of $\ell_p$ Lewis weights, which computes a constant factor approximation at the cost of roughly $\sqrt{d}$ sensitivity computations, with no polynomial dependence on $n$. Furthermore, we estimate the maximum $\ell_1$ sensitivity up to a $\sqrt{d}$ factor in $O(d)$ sensitivity computations. We also generalize these results to $\ell_p$ norms. Lastly, we experimentally show that for a wide class of structured matrices in real-world datasets, the total sensitivity can be quickly approximated and is significantly smaller than the theoretical prediction, demonstrating that real-world datasets have on average low intrinsic effective dimensionality.
Cite
Text
Padmanabhan et al. "Computing Approximate $\ell_p$ Sensitivities." Neural Information Processing Systems, 2023.Markdown
[Padmanabhan et al. "Computing Approximate $\ell_p$ Sensitivities." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/padmanabhan2023neurips-computing/)BibTeX
@inproceedings{padmanabhan2023neurips-computing,
title = {{Computing Approximate $\ell_p$ Sensitivities}},
author = {Padmanabhan, Swati and Woodruff, David and Zhang, Richard},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/padmanabhan2023neurips-computing/}
}