Robust Estimation Under Heterogeneous Corruption Rates

Abstract

We study the problem of robust estimation under heterogeneous corruption rates, where each sample may be independently corrupted with a known but non-identical probability. This setting arises naturally in distributed and federated learning, crowdsourcing, and sensor networks, yet existing robust estimators typically assume uniform or worst-case corruption, ignoring structural heterogeneity. For mean estimation for multivariate bounded distributions and univariate gaussian distributions, we give tight minimax rates for all heterogeneous corruption patterns. For multivariate gaussian mean estimation and linear regression, we establish the minimax rate for squared error up to a factor of $\sqrt{d}$, where $d$ is the dimension. Roughly, our findings suggest that samples beyond a certain corruption threshold may be discarded by the optimal estimators -- this threshold is determined by the empirical distribution of the corruption rates given.

Cite

Text

Chaudhuri et al. "Robust Estimation Under Heterogeneous Corruption Rates." Advances in Neural Information Processing Systems, 2025.

Markdown

[Chaudhuri et al. "Robust Estimation Under Heterogeneous Corruption Rates." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/chaudhuri2025neurips-robust/)

BibTeX

@inproceedings{chaudhuri2025neurips-robust,
  title     = {{Robust Estimation Under Heterogeneous Corruption Rates}},
  author    = {Chaudhuri, Syomantak and Li, Jerry and Courtade, Thomas},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/chaudhuri2025neurips-robust/}
}