Limitations on Variance-Reduction and Acceleration Schemes for Finite Sums Optimization

Abstract

We study the conditions under which one is able to efficiently apply variance-reduction and acceleration schemes to finite sum problems. First, we show that, perhaps surprisingly, the finite sum structure by itself is not sufficient for obtaining a complexity bound of $\tilde{\mathcal{O}}((n+L/\mu)\ln(1/\epsilon))$ for $L$-smooth and $\mu$-strongly convex finite sums: one must also know exactly which individual function is being referred to by the oracle at each iteration. Next, we show that for a broad class of first-order and coordinate-descent finite sum algorithms (including, e.g., SDCA, SVRG, and SAG), it is not possible to obtain an 'accelerated' complexity bound of $\tilde{\mathcal{O}}((n+\sqrt{nL/\mu})\ln(1/\epsilon))$ unless the strong convexity parameter is given explicitly. Lastly, we show that when this class of algorithms is used for minimizing $L$-smooth and non-strongly convex finite sums, the optimal complexity bound is $\tilde{\mathcal{O}}(n+L/\epsilon)$, assuming that (on average) the same update rule is used in every iteration, and $\tilde{\mathcal{O}}(n+\sqrt{nL/\epsilon})$ otherwise.
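As a point of reference for the oracle model discussed in the abstract, below is a minimal sketch of SVRG-style variance reduction on a finite sum $f(x)=\frac{1}{n}\sum_{i=1}^{n} f_i(x)$. The function and parameter names (`svrg`, `grad_i`, `step_size`, etc.) are illustrative and not taken from the paper; the point of the sketch is that the update explicitly uses the index $i$ of the sampled component, which is exactly the per-iteration information the first result shows is necessary for the fast $\tilde{\mathcal{O}}((n+L/\mu)\ln(1/\epsilon))$ rate.

```python
import numpy as np

def svrg(grad_i, n, x0, step_size, num_epochs, inner_iters, rng=None):
    """Minimal SVRG-style sketch for f(x) = (1/n) * sum_i f_i(x).

    grad_i(x, i) returns the gradient of the individual component f_i at x.
    Note that the algorithm relies on knowing which component index i the
    oracle is answering for -- the information highlighted in the abstract.
    """
    rng = rng or np.random.default_rng()
    x_tilde = np.asarray(x0, dtype=float).copy()
    for _ in range(num_epochs):
        # Full gradient at the snapshot point (one pass over all n components).
        full_grad = np.mean([grad_i(x_tilde, i) for i in range(n)], axis=0)
        x = x_tilde.copy()
        for _ in range(inner_iters):
            i = rng.integers(n)  # sample one individual component
            # Variance-reduced stochastic gradient: unbiased estimate of the
            # full gradient whose variance shrinks as x approaches x_tilde.
            g = grad_i(x, i) - grad_i(x_tilde, i) + full_grad
            x = x - step_size * g
        x_tilde = x
    return x_tilde
```

Without access to the index `i` (e.g., if the oracle only returned a gradient of an anonymous component), the correction term `grad_i(x_tilde, i)` could not be formed, which is the intuition behind the paper's first lower bound.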

Cite

Text

Arjevani. "Limitations on Variance-Reduction and Acceleration Schemes for Finite Sums Optimization." Neural Information Processing Systems, 2017.

Markdown

[Arjevani. "Limitations on Variance-Reduction and Acceleration Schemes for Finite Sums Optimization." Neural Information Processing Systems, 2017.](https://mlanthology.org/neurips/2017/arjevani2017neurips-limitations/)

BibTeX

@inproceedings{arjevani2017neurips-limitations,
  title     = {{Limitations on Variance-Reduction and Acceleration Schemes for Finite Sums Optimization}},
  author    = {Arjevani, Yossi},
  booktitle = {Neural Information Processing Systems},
  year      = {2017},
  pages     = {3540--3549},
  url       = {https://mlanthology.org/neurips/2017/arjevani2017neurips-limitations/}
}