Accelerating Inexact HyperGradient Descent for Bilevel Optimization

Abstract

We present a method for solving general nonconvex-strongly-convex bilevel optimization problems. Our method, Restarted Accelerated HyperGradient Descent (RAHGD), finds an $\epsilon$-first-order stationary point of the objective with $\tilde{\mathcal{O}}(\kappa^{3.25}\epsilon^{-1.75})$ oracle complexity, where $\kappa$ is the condition number of the lower-level objective and $\epsilon$ is the desired accuracy. We also propose a perturbed variant of RAHGD that finds an $\big(\epsilon,\mathcal{O}(\kappa^{2.5}\sqrt{\epsilon})\big)$-second-order stationary point within the same order of oracle complexity. These results are the best-known theoretical guarantees for finding stationary points in bilevel optimization, and they also improve upon the existing upper complexity bound for finding second-order stationary points in nonconvex-strongly-concave minimax optimization, setting a new state-of-the-art benchmark. Empirical studies validate the theoretical results.
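For context, the setting behind the abstract is the standard nonconvex-strongly-convex bilevel problem; the formulation and hypergradient below use the usual notation for this setting and are not quoted from the paper itself:

$$
\min_{x \in \mathbb{R}^{d_x}} \; \Phi(x) := f\big(x, y^*(x)\big)
\qquad \text{subject to} \qquad
y^*(x) = \operatorname*{arg\,min}_{y \in \mathbb{R}^{d_y}} g(x, y),
$$

where the lower-level objective $g(x,\cdot)$ is strongly convex with condition number $\kappa$. Under this assumption, the implicit function theorem gives the hypergradient

$$
\nabla \Phi(x) = \nabla_x f\big(x, y^*(x)\big) - \nabla^2_{xy} g\big(x, y^*(x)\big)\,\big[\nabla^2_{yy} g\big(x, y^*(x)\big)\big]^{-1} \nabla_y f\big(x, y^*(x)\big).
$$

Inexact hypergradient methods approximate $y^*(x)$ by a finite number of lower-level steps and the Hessian-inverse-vector product by an approximate linear solve; as its name suggests, RAHGD couples such inexact hypergradients with accelerated updates and a restart scheme.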

Cite

Text

Yang et al. "Accelerating Inexact HyperGradient Descent for Bilevel Optimization." NeurIPS 2023 Workshops: OPT, 2023.

Markdown

[Yang et al. "Accelerating Inexact HyperGradient Descent for Bilevel Optimization." NeurIPS 2023 Workshops: OPT, 2023.](https://mlanthology.org/neuripsw/2023/yang2023neuripsw-accelerating/)

BibTeX

@inproceedings{yang2023neuripsw-accelerating,
  title     = {{Accelerating Inexact HyperGradient Descent for Bilevel Optimization}},
  author    = {Yang, Haikuo and Luo, Luo and Li, Chris Junchi and Jordan, Michael and Fazel, Maryam},
  booktitle = {NeurIPS 2023 Workshops: OPT},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/yang2023neuripsw-accelerating/}
}