Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

Abstract

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $\mu$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $\mu$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $\tau$, having an {\em additive} $\log(1/\tau)$ dependence. Our algorithm leverages the stability-based approach from the algorithmic robust statistics literature, with crucial (and necessary) adaptations required in our setting. Our analysis may be of independent interest, involving the delicate design of a (non-spectral) decomposition for positive semi-definite matrices satisfying certain sparsity properties.

Cite

Text

Diakonikolas et al. "Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions." Neural Information Processing Systems, 2022.

Markdown

[Diakonikolas et al. "Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/diakonikolas2022neurips-outlierrobust/)

BibTeX

@inproceedings{diakonikolas2022neurips-outlierrobust,
  title     = {{Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions}},
  author    = {Diakonikolas, Ilias and Kane, Daniel and Lee, Jasper and Pensia, Ankit},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/diakonikolas2022neurips-outlierrobust/}
}