Beyond Catoni: Sharper Rates for Heavy-Tailed and Robust Mean Estimation

Abstract

We study the fundamental problem of estimating the mean of a $d$-dimensional distribution with covariance $\Sigma \preccurlyeq \sigma^2 I_d$ given $n$ samples. When $d = 1$, \cite{catoni} showed an estimator with error $(1+o(1)) \cdot \sigma \sqrt{\frac{2 \log \frac{1}{\delta}}{n}}$, with probability $1 - \delta$, matching the Gaussian error rate. For $d>1$, a natural estimator outputs the center of the minimum enclosing ball of one-dimensional confidence intervals to achieve a $1-\delta$ confidence radius of $\sqrt{\frac{2 d}{d+1}} \cdot \sigma \left(\sqrt{\frac{d}{n}} + \sqrt{\frac{2 \log \frac{1}{\delta}}{n}}\right)$, incurring a $\sqrt{\frac{2d}{d+1}}$-factor loss over the Gaussian rate. When the $\sqrt{\frac{d}{n}}$ term dominates by a $\sqrt{\log \frac{1}{\delta}}$ factor, \cite{lee2022optimal-highdim} showed an improved estimator matching the Gaussian rate. This raises a natural question: Is the $\sqrt{\frac{2 d}{d+1}}$ loss \emph{necessary} when the $\sqrt{\frac{2 \log \frac{1}{\delta}}{n}}$ term dominates? We show that the answer is \emph{no} – we construct an estimator that improves over the above naive estimator by a constant factor. We also consider robust estimation, where an adversary is allowed to corrupt an $\epsilon$-fraction of samples arbitrarily: in this case, we show that the above strategy of combining one-dimensional estimates and incurring the $\sqrt{\frac{2d}{d+1}}$-factor \emph{is} optimal in the infinite-sample limit.

Cite

Text

Gupta et al. "Beyond Catoni: Sharper Rates for Heavy-Tailed and Robust Mean Estimation." Conference on Learning Theory, 2024.

Markdown

[Gupta et al. "Beyond Catoni: Sharper Rates for Heavy-Tailed and Robust Mean Estimation." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/gupta2024colt-beyond/)

BibTeX

@inproceedings{gupta2024colt-beyond,
  title     = {{Beyond Catoni: Sharper Rates for Heavy-Tailed and Robust Mean Estimation}},
  author    = {Gupta, Shivam and Hopkins, Samuel and Price, Eric},
  booktitle = {Conference on Learning Theory},
  year      = {2024},
  pages     = {2232-2269},
  volume    = {247},
  url       = {https://mlanthology.org/colt/2024/gupta2024colt-beyond/}
}