Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion
Abstract
We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our improved non-smooth analysis also immediately recovers all optimal or best-known results for finding $\epsilon$-stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.
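To make the online-to-non-convex conversion mentioned in the abstract concrete, the following is a minimal sketch, assuming a hypothetical stochastic gradient oracle `stochastic_grad`, a toy objective, and illustrative parameters (horizon `T`, window size `K`, update radius `D`, learning rate `lr`). It uses projected online gradient descent as the inner online learner, which is one simple choice with a standard regret bound, not necessarily the exact algorithm analyzed in the paper.

```python
# A hedged sketch of the online-to-non-convex conversion: an online learner
# proposes bounded updates, gradients are queried at random points along each
# update segment, and the learner is fed the resulting linear losses.
import numpy as np

def stochastic_grad(w, rng):
    # Placeholder stochastic (sub)gradient of a toy non-smooth objective f(w) = ||w||_1.
    return np.sign(w) + 0.01 * rng.standard_normal(w.shape)

def o2nc_sketch(x0, T=1000, K=10, D=0.01, lr=0.001, seed=0):
    """Online-to-non-convex conversion with projected online gradient descent
    as the inner online learner (one simple choice; any learner with a suitable
    regret guarantee could be substituted)."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    delta = np.zeros_like(x0)        # the online learner's current proposed update
    window, candidates = [], []
    for t in range(T):
        # Query the stochastic gradient at a point sampled uniformly on the
        # segment between the current iterate and the next one.
        s = rng.uniform()
        g = stochastic_grad(x + s * delta, rng)
        x = x + delta                # apply the update proposed by the online learner
        window.append(x)
        # The online learner sees the linear loss <g, delta>: projected OGD step.
        delta = delta - lr * g
        norm = np.linalg.norm(delta)
        if norm > D:                 # project back onto the ball of radius D
            delta = delta * (D / norm)
        if (t + 1) % K == 0:         # end of a window: record the averaged iterate
            candidates.append(np.mean(window, axis=0))
            window = []
    # Return a candidate chosen uniformly at random; under the paper's assumptions,
    # such a point is (delta, epsilon)-stationary at the rates stated in the abstract.
    return candidates[rng.integers(len(candidates))]

x_out = o2nc_sketch(np.ones(5))
```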
Cite

Text

Cutkosky et al. "Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion." International Conference on Machine Learning, 2023.

Markdown

[Cutkosky et al. "Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/cutkosky2023icml-optimal/)

BibTeX
@inproceedings{cutkosky2023icml-optimal,
  title     = {{Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion}},
  author    = {Cutkosky, Ashok and Mehta, Harsh and Orabona, Francesco},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {6643--6670},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/cutkosky2023icml-optimal/}
}