Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter
Abstract
Given a non-convex function $f(x)$ that is an average of $n$ smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue $-\sigma$ of the Hessian. This parameter $\sigma$ captures how strongly non-convex $f(x)$ is, and is analogous to the strong convexity parameter for convex optimization. At least in theory, our methods outperform known results for a range of values of $\sigma$, and can also be used to find approximate local minima. Our result implies an interesting dichotomy: there exists a threshold $\sigma_0$ so that the (currently) fastest methods for $\sigma>\sigma_0$ and for $\sigma<\sigma_0$ have different behaviors; the former scales with $n^{2/3}$ while the latter scales with $n^{3/4}$.
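To make the setting concrete, the following display is a sketch based only on the abstract: it records the standard reading of "average of $n$ smooth functions," the role of $\sigma$ as a lower bound on the Hessian eigenvalues, and the approximate-stationarity goal. The accuracy parameter $\varepsilon$ is introduced here for illustration; the precise assumptions and complexity bounds are in the paper.

$$
f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x), \qquad \nabla^2 f(x) \succeq -\sigma I \ \ \text{for all } x, \qquad \text{goal: find } x \ \text{with} \ \|\nabla f(x)\| \le \varepsilon .
$$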
Cite
Text
Allen-Zhu. "Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter." International Conference on Machine Learning, 2017.Markdown
[Allen-Zhu. "Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/allenzhu2017icml-natasha/)BibTeX
@inproceedings{allenzhu2017icml-natasha,
  title = {{Natasha: Faster Non-Convex Stochastic Optimization via Strongly Non-Convex Parameter}},
  author = {Allen-Zhu, Zeyuan},
  booktitle = {International Conference on Machine Learning},
  year = {2017},
  pages = {89-97},
  volume = {70},
  url = {https://mlanthology.org/icml/2017/allenzhu2017icml-natasha/}
}