An Accelerated Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-Level Optimization

Abstract

Many important machine learning applications involve regularized nonconvex bi-level optimization. However, existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from high computational complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit differentiation (AID) scheme for nonconvex bi-level optimization with possibly nonconvex and nonsmooth regularizers. In particular, the algorithm applies Nesterov's momentum to accelerate the computation of the implicit gradient involved in AID. We provide a comprehensive analysis of the global convergence properties of this algorithm by identifying its intrinsic potential function. Specifically, we formally establish the convergence of the model parameters to a critical point of the bi-level problem, and obtain an improved computational complexity of $\widetilde{\mathcal{O}}(\kappa^{3.5}\epsilon^{-2})$ over the state-of-the-art result. Moreover, we analyze the asymptotic convergence rates of this algorithm under a class of local nonconvex geometries characterized by a Łojasiewicz-type gradient inequality. Experiments on hyper-parameter optimization demonstrate the effectiveness of our algorithm.
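To make the ingredients named in the abstract concrete, the following is a minimal sketch of a proximal gradient step driven by an AID-style hypergradient, and it should not be read as the paper's exact algorithm. Everything specific here is an assumption chosen for illustration: the toy quadratic lower-level objective $g(x,y)=\tfrac12 y^\top Q y - y^\top A x$, the upper-level loss $f(y)=\tfrac12\|y-b\|^2$, the $\ell_1$ regularizer (a convex stand-in for the possibly nonconvex regularizers the paper covers), and all names, step sizes, and iteration counts.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (illustrative, convex regularizer)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def nesterov_solve(Q, rhs, u0, steps, L, mu):
    """Approximately solve Q u = rhs with Nesterov-accelerated gradient descent
    on the strongly convex quadratic 0.5 u^T Q u - rhs^T u, warm-started at u0."""
    beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
    u, z = u0.copy(), u0.copy()
    for _ in range(steps):
        u_next = z - (Q @ z - rhs) / L      # gradient step at the extrapolated point
        z = u_next + beta * (u_next - u)    # momentum extrapolation
        u = u_next
    return u

# Hypothetical toy data (not from the paper).
Q = np.diag([1.0, 2.0, 4.0])                # lower-level Hessian, eigenvalues in [mu, L]
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, -1.0, 0.5])
mu, L = 1.0, 4.0                            # strong convexity / smoothness of g in y
lam, step = 0.1, 0.5                        # regularization weight, outer step size

x, y, v = np.zeros(2), np.zeros(3), np.zeros(3)
for _ in range(200):
    # Approximate the lower-level solution y*(x), which satisfies Q y = A x.
    y = nesterov_solve(Q, A @ x, y, 10, L, mu)
    # Approximate v = [grad_yy g]^{-1} grad_y f, i.e. solve Q v = y - b.
    v = nesterov_solve(Q, y - b, v, 10, L, mu)
    # Implicit gradient: grad_x f - grad_xy g @ v, with grad_x f = 0 and grad_xy g = -A^T.
    hypergrad = A.T @ v
    # Proximal gradient step on the regularized upper-level objective.
    x = soft_threshold(x - step * hypergrad, step * lam)

x_hat = x
```

In this sketch the same accelerated solver is reused for both the lower-level problem and the linear system, which is one plausible reading of "accelerating the computation of the implicit gradient"; both are warm-started from the previous outer iteration so that a few inner steps suffice.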

Cite

Text

Chen et al. "An Accelerated Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-Level Optimization." Machine Learning, 2023. doi:10.1007/S10994-023-06329-6

Markdown

[Chen et al. "An Accelerated Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-Level Optimization." Machine Learning, 2023.](https://mlanthology.org/mlj/2023/chen2023mlj-accelerated/) doi:10.1007/S10994-023-06329-6

BibTeX

@article{chen2023mlj-accelerated,
  title     = {{An Accelerated Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-Level Optimization}},
  author    = {Chen, Ziyi and Kailkhura, Bhavya and Zhou, Yi},
  journal   = {Machine Learning},
  year      = {2023},
  pages     = {1433-1463},
  doi       = {10.1007/S10994-023-06329-6},
  volume    = {112},
  url       = {https://mlanthology.org/mlj/2023/chen2023mlj-accelerated/}
}