Learning Theory for Kernel Bilevel Optimization

Abstract

Bilevel optimization has emerged as a technique for addressing a wide range of machine learning problems that involve an outer objective implicitly determined by the minimizer of an inner problem. While prior works have primarily focused on the parametric setting, a learning-theoretic foundation for bilevel optimization in the nonparametric case remains relatively unexplored. In this paper, we take a first step toward bridging this gap by studying Kernel Bilevel Optimization (KBO), where the inner objective is optimized over a reproducing kernel Hilbert space. This setting enables rich function approximation while providing a foundation for rigorous theoretical analysis. In this context, we derive novel finite-sample generalization bounds for KBO, leveraging tools from empirical process theory. These bounds further allow us to assess the statistical accuracy of gradient-based methods applied to the empirical discretization of KBO. We numerically illustrate our theoretical findings on a synthetic instrumental variable regression task.
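The abstract does not spell out the algorithmic details; as a minimal illustrative sketch of the general KBO setup (not the authors' method or bounds), the snippet below builds a toy instance in NumPy. The inner problem is kernel ridge regression over an RKHS, solved in closed form via the representer theorem on the empirical discretization; the outer objective is a held-out validation loss; and the hypergradient with respect to the regularization parameter is obtained by implicit differentiation of the inner solution. The kernel bandwidth, step size, and synthetic data are all hypothetical choices for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-sets A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X_tr, X_val = rng.uniform(-3, 3, (80, 1)), rng.uniform(-3, 3, (40, 1))
f = lambda X: np.sin(X).ravel()
y_tr = f(X_tr) + 0.3 * rng.standard_normal(80)
y_val = f(X_val) + 0.3 * rng.standard_normal(40)

K = rbf_kernel(X_tr, X_tr)       # n x n Gram matrix of the inner problem
K_val = rbf_kernel(X_val, X_tr)  # m x n cross-kernel entering the outer loss
n, m = len(X_tr), len(X_val)

# Outer variable: regularization strength lam of the inner RKHS problem
#   min_h (1/n) sum_i (h(x_i) - y_i)^2 + lam * ||h||_H^2,
# whose empirical minimizer is h = sum_j alpha_j k(., x_j) with
#   alpha(lam) = (K + n*lam*I)^{-1} y  (representer theorem / KRR).
log_lam = np.log(1.0)  # optimize log(lam) to keep lam positive
for step in range(200):
    lam = np.exp(log_lam)
    A = K + n * lam * np.eye(n)
    alpha = np.linalg.solve(A, y_tr)       # inner minimizer
    resid = K_val @ alpha - y_val          # outer (validation) residual
    # Implicit differentiation of the inner optimality condition:
    #   d(alpha)/d(lam) = -n * A^{-1} alpha
    dalpha = -n * np.linalg.solve(A, alpha)
    grad_lam = (2.0 / m) * resid @ (K_val @ dalpha)
    log_lam -= 0.1 * grad_lam * lam        # chain rule through lam = exp(log_lam)

print(f"selected lambda: {np.exp(log_lam):.4g}, "
      f"outer (validation) MSE: {np.mean(resid**2):.4g}")
```

The paper's generalization bounds concern exactly this gap: the hypergradient above is computed from the empirical inner minimizer on n samples rather than the population one, and the analysis quantifies how that discretization error propagates to gradient-based methods on the outer problem.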

Cite

Text

El Khoury et al. "Learning Theory for Kernel Bilevel Optimization." Advances in Neural Information Processing Systems, 2025.

Markdown

[El Khoury et al. "Learning Theory for Kernel Bilevel Optimization." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/khoury2025neurips-learning/)

BibTeX

@inproceedings{khoury2025neurips-learning,
  title     = {{Learning Theory for Kernel Bilevel Optimization}},
  author    = {El Khoury, Fares and Pauwels, Edouard and Vaiter, Samuel and Arbel, Michael},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/khoury2025neurips-learning/}
}