Gradient Boosting Versus Mixed Integer Programming for Sparse Additive Modeling

Abstract

Gradient boosting is a widely used algorithm for fitting sparse additive models over flexible classes of basis functions. Despite its popularity, the performance of gradient boosting as an approximation algorithm to the empirical risk minimizing model with a specific number k of selected basis functions is poorly understood. We provide a theoretical lower bound of $1/2 - 1/(4k-2)$ on the worst-case approximation ratio for the risk reduction that gradient boosting achieves relative to the optimal model when both are limited to k terms. This result reveals an inherent limitation in boosting's ability to approximate the best possible sparse additive model, raising the question of how tight and representative this bound is in practice. To answer this question empirically, we employ mixed integer programming (MIP) to approximate the optimal additive models on 21 real datasets. The experimental results do not show larger gaps than the theoretical analysis, indicating that the theoretical lower bound is tight. Moreover, for twelve datasets, the approximation gaps are of the same order of magnitude as the theoretical lower bound, which shows the representativeness of the theoretical bound. Beyond the theoretical question, the study also has the practical implication that the presented MIP approach frequently offers notable improvements over gradient boosting.
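As a quick illustration (not part of the paper, and using hypothetical function names), the stated bound $1/2 - 1/(4k-2)$ can be evaluated exactly for a few sparsity levels k; it starts at 0 for k = 1 and approaches 1/2 as k grows:

```python
# Evaluate the abstract's worst-case approximation-ratio bound 1/2 - 1/(4k - 2)
# for the risk reduction of k-term gradient boosting, as exact fractions.
from fractions import Fraction


def boosting_ratio_bound(k: int) -> Fraction:
    """Return the bound 1/2 - 1/(4k - 2) for k selected basis functions."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return Fraction(1, 2) - Fraction(1, 4 * k - 2)


if __name__ == "__main__":
    for k in (1, 2, 5, 50):
        print(f"k={k:3d}  bound={boosting_ratio_bound(k)}")
```

For example, k = 2 gives 1/2 - 1/6 = 1/3, and the bound never reaches 1/2 for any finite k.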

Cite

Text

Yang et al. "Gradient Boosting Versus Mixed Integer Programming for Sparse Additive Modeling." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06078-5_26

Markdown

[Yang et al. "Gradient Boosting Versus Mixed Integer Programming for Sparse Additive Modeling." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/yang2025ecmlpkdd-gradient/) doi:10.1007/978-3-032-06078-5_26

BibTeX

@inproceedings{yang2025ecmlpkdd-gradient,
  title     = {{Gradient Boosting Versus Mixed Integer Programming for Sparse Additive Modeling}},
  author    = {Yang, Fan and Le Bodic, Pierre and Boley, Mario},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2025},
  pages     = {453--470},
  doi       = {10.1007/978-3-032-06078-5_26},
  url       = {https://mlanthology.org/ecmlpkdd/2025/yang2025ecmlpkdd-gradient/}
}