Tighter Analysis for ProxSkip

Abstract

In this paper, we provide a tighter analysis for ProxSkip, an algorithm that allows fewer proximal operator computations when solving composite optimization problems. We improve the decrease rate of the Lyapunov function established by the existing analysis from $\mathcal{O}(p^2)$ to $\mathcal{O}(p)$ when $p$, the frequency of the proximal operator calls, is small enough. Our theoretical analysis also reveals the drawbacks of using large step sizes for gradient descent in ProxSkip when the proximal operator part is the bottleneck. Our main motivation comes from the continuous limit, in which the original analysis of ProxSkip fails to guarantee convergence when both the step size $\gamma$ and the frequency $p$ tend to zero. We construct a counterexample to demonstrate why such counterintuitive behavior occurs for the original analysis, and then propose a novel variant of the Lyapunov function that yields a tighter analysis and avoids the problem of the old one. This new Lyapunov function can be directly extended to many other variants of ProxSkip. When applied to the stochastic gradient setup, our analysis improves the proximal operator complexity of SProxSkip from $\mathcal{O}(\sqrt{\frac{1}{\varepsilon\mu^2}}\log(\frac{1}{\varepsilon}))$ to $\mathcal{O}(\sqrt{\kappa}\log(\frac{1}{\varepsilon}))$.
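
For context, ProxSkip alternates cheap gradient steps with proximal steps that are taken only with probability $p$, using a control variate to keep the skipped steps consistent. Below is a minimal, hedged Python sketch of such a loop for minimizing $f(x) + \psi(x)$; the helper names `grad_f` and `prox_psi`, and the specific update rule, are illustrative assumptions following the original ProxSkip method rather than code provided by this paper.

```python
import numpy as np

def proxskip(x0, grad_f, prox_psi, gamma, p, num_iters, rng=None):
    """Hedged sketch of a ProxSkip-style loop for min_x f(x) + psi(x).

    Assumed interface (not from this paper's text): grad_f(x) returns the
    gradient of the smooth part f, and prox_psi(v, step) returns
    prox_{step * psi}(v).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    h = np.zeros_like(x)  # control variate compensating for skipped prox steps
    for _ in range(num_iters):
        # Cheap gradient step on f, shifted by the control variate h.
        x_hat = x - gamma * (grad_f(x) - h)
        if rng.random() < p:
            # Expensive proximal step, taken only with probability p.
            x_new = prox_psi(x_hat - (gamma / p) * h, gamma / p)
        else:
            # Skip the prox and keep the gradient-only iterate.
            x_new = x_hat
        # Update the control variate so the skipped steps remain consistent.
        h = h + (p / gamma) * (x_new - x_hat)
        x = x_new
    return x
```

As a usage illustration, when $\psi$ is the indicator of a constraint set, `prox_psi` is a projection, and a smaller $p$ means fewer projections per gradient step; the bound on how small $p$ can be while preserving the rate is exactly what the tighter analysis improves.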

Cite

Text

Hu and Huang. "Tighter Analysis for ProxSkip." International Conference on Machine Learning, 2023.

Markdown

[Hu and Huang. "Tighter Analysis for ProxSkip." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/hu2023icml-tighter/)

BibTeX

@inproceedings{hu2023icml-tighter,
  title     = {{Tighter Analysis for ProxSkip}},
  author    = {Hu, Zhengmian and Huang, Heng},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {13469--13496},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/hu2023icml-tighter/}
}