Boosting for Predictive Sufficiency
Abstract
Out-of-distribution (OOD) generalization is a hallmark of robust and reliable machine learning systems. Recent empirical work has observed that existing OOD generalization methods often underperform on real-world tabular data, where hidden confounding shifts drive distribution changes that boosting models handle more effectively. Part of boosting’s success has been attributed to variance reduction, handling of missing variables, feature selection, and connections to multicalibration. This paper uncovers a crucial reason behind its success in OOD generalization: boosting’s ability to infer stable environments that are robust to hidden confounding shifts and to maximize predictive performance within those environments. We introduce an information-theoretic notion called $\alpha$-predictive sufficiency and formalize its link to OOD generalization under hidden confounding. We show that boosting implicitly identifies suitable environments and produces an $\alpha$-predictive sufficient predictor. We validate our theoretical results through synthetic and real-world experiments, which show that boosting achieves robust performance by identifying these environments and maximizing the association between predictions and true outcomes.
Cite
Text
Reddy et al. "Boosting for Predictive Sufficiency." International Conference on Learning Representations, 2026.
Markdown
[Reddy et al. "Boosting for Predictive Sufficiency." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/reddy2026iclr-boosting/)
BibTeX
@inproceedings{reddy2026iclr-boosting,
title = {{Boosting for Predictive Sufficiency}},
author = {Reddy, Abbavaram Gowtham and Verma, Rajeev and Rubio-Madrigal, Celia and Muandet, Krikamol and Burkholz, Rebekka},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/reddy2026iclr-boosting/}
}