Adaptive Stochastic Variance Reduction for Non-Convex Finite-Sum Minimization

Abstract

We propose an adaptive variance-reduction method, called AdaSpider, for the minimization of $L$-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider combines an AdaGrad-inspired (Duchi et al., 2011), yet distinct, adaptive step-size schedule with the recursive \textit{stochastic path integrated estimator} (SPIDER) proposed in (Fang et al., 2018). To our knowledge, AdaSpider is the first parameter-free non-convex variance-reduction method, in the sense that it does not require knowledge of problem-dependent parameters such as the smoothness constant $L$, the target accuracy $\epsilon$, or any bound on gradient norms. With this schedule, we compute an $\epsilon$-stationary point with $\tilde{O}\left(n + \sqrt{n}/\epsilon^2\right)$ oracle calls, which matches the corresponding lower bound up to logarithmic factors.
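
The abstract describes two ingredients: a SPIDER-style recursive gradient estimator and an AdaGrad-inspired step size that needs no knowledge of $L$ or $\epsilon$. The Python sketch below illustrates how such a combination can look; it is a minimal illustration, not the paper's exact AdaSpider step-size schedule, and the oracle name grad_i, the epoch length q, and the specific step-size formula are assumptions made for clarity.

# Illustrative sketch only: a SPIDER-style recursive estimator paired with an
# AdaGrad-norm step size. The exact AdaSpider schedule in the paper differs;
# grad_i, q, and the step-size formula below are assumed for illustration.
import numpy as np

def spider_adaptive(grad_i, x0, n, T, q=None, seed=0):
    """grad_i(i, x): gradient of the i-th component f_i at x (numpy array x0)."""
    rng = np.random.default_rng(seed)
    q = q or int(np.ceil(np.sqrt(n)))       # epoch length ~ sqrt(n)
    x_prev, x = x0.copy(), x0.copy()
    v = np.zeros_like(x0)
    acc = 0.0                                # running sum of ||v_t||^2
    for t in range(T):
        if t % q == 0:
            # full-gradient restart (n oracle calls)
            v = np.mean([grad_i(i, x) for i in range(n)], axis=0)
        else:
            # recursive SPIDER correction with a single sampled component
            i = rng.integers(n)
            v = grad_i(i, x) - grad_i(i, x_prev) + v
        acc += float(np.dot(v, v))
        eta = 1.0 / np.sqrt(1.0 + acc)       # AdaGrad-norm style step size
        x_prev, x = x, x - eta * v
    return x

# Example with least-squares components f_i(x) = 0.5 * (a_i @ x - b_i)^2:
# A, b = np.random.randn(100, 5), np.random.randn(100)
# x = spider_adaptive(lambda i, x: (A[i] @ x - b[i]) * A[i], np.zeros(5), n=100, T=500)

The choice q ≈ √n mirrors the √n factor in the $\tilde{O}(n + \sqrt{n}/\epsilon^2)$ complexity quoted in the abstract: each epoch pays n oracle calls for the restart and O(√n) calls for the recursive corrections.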

Cite

Text

Kavis et al. "Adaptive Stochastic Variance Reduction for Non-Convex Finite-Sum Minimization." Neural Information Processing Systems, 2022.

Markdown

[Kavis et al. "Adaptive Stochastic Variance Reduction for Non-Convex Finite-Sum Minimization." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/kavis2022neurips-adaptive/)

BibTeX

@inproceedings{kavis2022neurips-adaptive,
  title     = {{Adaptive Stochastic Variance Reduction for Non-Convex Finite-Sum Minimization}},
  author    = {Kavis, Ali and Skoulakis, Stratis and Antonakopoulos, Kimon and Dadi, Leello Tadesse and Cevher, Volkan},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/kavis2022neurips-adaptive/}
}