Robust Training in High Dimensions via Block Coordinate Geometric Median Descent
Abstract
Geometric median (GM) is a classical statistical method for robustly estimating the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 1/2. However, its computational complexity makes it infeasible for robustifying stochastic gradient descent (SGD) in high-dimensional optimization problems. In this paper, we show that by applying GM to only a judiciously chosen block of coordinates at a time and using a memory mechanism, one can retain the breakdown point of 1/2 for smooth non-convex problems, with non-asymptotic convergence rates comparable to those of SGD with GM, while achieving a significant speedup in training. We further validate the run-time and robustness of our approach empirically on several popular deep learning tasks. Code available at: https://github.com/anishacharya/BGMD
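To make the core idea concrete, here is a minimal sketch of block-coordinate GM aggregation: worker gradients are combined by computing the geometric median (via Weiszfeld's algorithm) on one coordinate block and a cheap coordinate-wise mean elsewhere. This is an illustrative simplification, not the authors' implementation — in particular, the block here is chosen uniformly at random, whereas the paper selects it judiciously and uses a memory mechanism; the function names are our own.

```python
import numpy as np

def geometric_median(points, iters=100, tol=1e-7):
    """Approximate the geometric median of rows of `points`
    using Weiszfeld's iteratively reweighted averaging."""
    z = points.mean(axis=0)
    for _ in range(iters):
        d = np.linalg.norm(points - z, axis=1)
        d = np.maximum(d, 1e-12)          # guard against division by zero
        w = 1.0 / d
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def block_gm_aggregate(grads, block_size, rng):
    """Aggregate worker gradients (one row per worker): robust GM on a
    sampled coordinate block, plain mean on the remaining coordinates.
    Random block selection is a simplifying assumption for illustration."""
    n_coords = grads.shape[1]
    block = rng.choice(n_coords, size=block_size, replace=False)
    agg = grads.mean(axis=0)              # cheap estimate off-block
    agg[block] = geometric_median(grads[:, block])
    return agg
```

Because GM runs only on a block of size much smaller than the full dimension, the per-step aggregation cost drops accordingly, while the GM block retains robustness to corrupted workers.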
Cite
Text
Acharya et al. "Robust Training in High Dimensions via Block Coordinate Geometric Median Descent." Artificial Intelligence and Statistics, 2022.

Markdown
[Acharya et al. "Robust Training in High Dimensions via Block Coordinate Geometric Median Descent." Artificial Intelligence and Statistics, 2022.](https://mlanthology.org/aistats/2022/acharya2022aistats-robust/)

BibTeX
@inproceedings{acharya2022aistats-robust,
title = {{Robust Training in High Dimensions via Block Coordinate Geometric Median Descent}},
author = {Acharya, Anish and Hashemi, Abolfazl and Jain, Prateek and Sanghavi, Sujay and Dhillon, Inderjit S. and Topcu, Ufuk},
booktitle = {Artificial Intelligence and Statistics},
year = {2022},
pages = {11145-11168},
volume = {151},
url = {https://mlanthology.org/aistats/2022/acharya2022aistats-robust/}
}