Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
Abstract
Existing gradient-based optimization methods for hyperparameter tuning can only guarantee theoretical convergence to stationary solutions when the bilevel program satisfies the condition that, for fixed upper-level variables, the lower-level problem is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied by the bilevel programs that arise from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We then ask: can this algorithm achieve stationary solutions without the LLSC and LLS assumptions? We provide a positive answer to this question for bilevel programs from a broad class of hyperparameter tuning applications. Extensive experiments support our theoretical results and demonstrate the superiority of the proposed VF-iDCA when applied to tune hyperparameters.
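For context, the title's ingredients fit together as follows: the bilevel hyperparameter problem is rewritten using the lower-level value function, which turns lower-level optimality into a difference-of-convex constraint that a DCA can then handle without LLSC or LLS. A minimal sketch, with symbols (F, f, v, λ, w) introduced here for illustration rather than taken from the paper:

```latex
% Sketch of the value-function (VF) reformulation behind a VF-based DCA.
% Illustrative symbols: F = upper-level (validation) objective,
% f = lower-level (training) objective, \lambda = hyperparameters,
% w = model parameters, v = lower-level value function.
\begin{align*}
  &\min_{\lambda,\, w} \; F(\lambda, w)
    \quad \text{s.t.} \quad w \in \operatorname*{argmin}_{w'} f(\lambda, w')
  && \text{(bilevel program)} \\
  &v(\lambda) := \min_{w'} f(\lambda, w')
  && \text{(value function)} \\
  &\min_{\lambda,\, w} \; F(\lambda, w)
    \quad \text{s.t.} \quad f(\lambda, w) - v(\lambda) \le 0
  && \text{(VF reformulation)}
\end{align*}
% If f is jointly convex in (\lambda, w), then v is convex (partial
% minimization of a jointly convex function), so f - v is a difference
% of convex (DC) functions. A DCA step linearizes v at the current
% iterate and solves the resulting convex subproblem; in VF-iDCA these
% subproblems are solved inexactly.
```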
Cite
Text
Gao et al. "Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems." International Conference on Machine Learning, 2022.
Markdown
[Gao et al. "Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/gao2022icml-value/)
BibTeX
@inproceedings{gao2022icml-value,
title = {{Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems}},
author = {Gao, Lucy L and Ye, Jane and Yin, Haian and Zeng, Shangzhi and Zhang, Jin},
booktitle = {International Conference on Machine Learning},
year = {2022},
pages = {7164--7182},
volume = {162},
url = {https://mlanthology.org/icml/2022/gao2022icml-value/}
}