Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Abstract

Existing gradient-based optimization methods for hyperparameter tuning can only guarantee theoretical convergence to stationary solutions when the bilevel program satisfies the condition that, for fixed upper-level variables, the lower-level problem is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for the bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We then ask: can this algorithm achieve stationary solutions without the LLSC and LLS assumptions? We provide a positive answer to this question for bilevel programs arising from a broad class of hyperparameter tuning applications. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed VF-iDCA when applied to tune hyperparameters.
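As a minimal sketch of the value-function reformulation the title refers to (the notation below is chosen here for illustration, not quoted from the paper): the bilevel hyperparameter selection problem minimizes a validation objective $F$ over hyperparameters $\lambda$ and model parameters $w$, subject to $w$ solving the training problem with objective $f(\lambda, \cdot)$. Writing $v(\lambda)$ for the lower-level value function collapses the bilevel constraint into a single inequality:

\begin{align*}
\min_{\lambda \ge 0,\; w} \quad & F(\lambda, w) \\
\text{s.t.} \quad & f(\lambda, w) - v(\lambda) \le 0,
\qquad v(\lambda) := \min_{w'} f(\lambda, w').
\end{align*}

Assuming $f$ is jointly convex in $(\lambda, w)$, both $f$ and $v$ are convex, so the constraint is a difference of two convex functions. A difference-of-convex algorithm can then linearize the concave part $-v(\lambda)$ at the current iterate and solve the resulting convex subproblem at each step; the "inexactness" in VF-iDCA refers to solving these subproblems only approximately.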

Cite

Text

Gao et al. "Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems." International Conference on Machine Learning, 2022.

Markdown

[Gao et al. "Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/gao2022icml-value/)

BibTeX

@inproceedings{gao2022icml-value,
  title     = {{Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems}},
  author    = {Gao, Lucy L. and Ye, Jane J. and Yin, Haian and Zeng, Shangzhi and Zhang, Jin},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {7164--7182},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/gao2022icml-value/}
}