Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion

Abstract

We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. These algorithms improve the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our improved non-smooth analysis also immediately recovers all optimal or best-known results for finding $\epsilon$-stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.
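For intuition, below is a minimal sketch of the online-to-non-convex conversion the abstract describes, assuming a toy non-smooth objective (f(x) = ||x||_1 with noisy subgradients) and projected online gradient descent, restarted every K steps, as the online learner. All names and parameter choices (o2nc, stochastic_grad, T, K, D) are illustrative assumptions, not the authors' code.

import numpy as np

def stochastic_grad(x, rng, noise=0.1):
    # Toy non-smooth objective f(x) = ||x||_1; subgradient sign(x) plus Gaussian noise.
    return np.sign(x) + noise * rng.standard_normal(x.shape)

def o2nc(x0, T=3000, K=30, D=0.01, seed=0):
    # Online-to-non-convex conversion sketch: the online learner proposes bounded
    # updates Delta_t (||Delta_t|| <= D); gradients are queried at a uniformly random
    # point on the segment [x_{t-1}, x_t] and fed back as linear losses <g_t, Delta>.
    rng = np.random.default_rng(seed)
    lr = D / np.sqrt(K)                    # standard OGD step size on a ball of radius D
    x = np.asarray(x0, dtype=float).copy()
    delta = np.zeros_like(x)
    block_grads, best = [], None
    for t in range(1, T + 1):
        x_next = x + delta                 # step proposed by the online learner
        s = rng.uniform()                  # random point on the segment [x, x + delta]
        g = stochastic_grad(x + s * delta, rng)
        block_grads.append(g)
        # Online learner update on the linear loss <g, Delta>: projected OGD.
        delta = delta - lr * g
        norm = np.linalg.norm(delta)
        if norm > D:
            delta *= D / norm
        x = x_next
        if t % K == 0:
            # The gradient averaged over a block of K steps serves as the
            # (delta, epsilon)-stationarity certificate, with delta ~ K * D;
            # keep the block with the smallest average.
            avg = np.linalg.norm(np.mean(block_grads, axis=0))
            if best is None or avg < best[0]:
                best = (avg, x.copy())
            block_grads = []
            delta = np.zeros_like(x)       # restart the learner for the next block
    return best

norm, x_out = o2nc(np.ones(5))
print(f"averaged gradient norm ~= {norm:.3f}")

In this sketch the regret of the restarted online learner within each block controls the norm of the block-averaged gradient, which is how standard online-learning regret bounds translate into the stationarity guarantees stated above.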

Cite

Text

Cutkosky et al. "Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion." International Conference on Machine Learning, 2023.

Markdown

[Cutkosky et al. "Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/cutkosky2023icml-optimal/)

BibTeX

@inproceedings{cutkosky2023icml-optimal,
  title     = {{Optimal Stochastic Non-Smooth Non-Convex Optimization Through Online-to-Non-Convex Conversion}},
  author    = {Cutkosky, Ashok and Mehta, Harsh and Orabona, Francesco},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {6643--6670},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/cutkosky2023icml-optimal/}
}