Post-Selection Confidence Bounds for Prediction Performance
Abstract
In machine learning, the selection of a promising model from a potentially large number of competing models and the assessment of its generalization performance are critical tasks that need careful consideration. Typically, model selection and evaluation are strictly separated tasks, splitting the sample at hand into training, validation, and evaluation sets, and only computing a single confidence interval for the prediction performance of the final selected model. We however regard the selection problem as a simultaneous inference problem and propose an algorithm to compute valid lower confidence bounds for multiple models that have been selected based on their prediction performance in the evaluation set. For this, we use bootstrap tilting and a maxT-type multiplicity correction. Various simulation experiments show that this leads to lower confidence bounds for the conditional performance that are at least as good as bounds from standard methods, and that reliably reach the nominal coverage probability. Also, a better performing final prediction model is selected this way, especially when the sample size is small. The approach is universally applicable for any combination of prediction models, any model selection strategy, and any prediction performance measure that accepts weights.
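To make the idea of simultaneous lower confidence bounds concrete, the following is a minimal sketch of a maxT-type correction over several candidate models. It is not the authors' algorithm: it swaps bootstrap tilting for an ordinary nonparametric bootstrap with a normal-style interval, and it assumes accuracy as the performance measure. The function name `maxt_lower_bounds`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np


def maxt_lower_bounds(correct, alpha=0.05, n_boot=2000, seed=0):
    """Simultaneous lower confidence bounds for the accuracy of several models.

    correct: (n_observations, n_models) array of 0/1 indicators, one column
             per candidate model, computed on the evaluation set.
    NOTE: plain nonparametric bootstrap, not bootstrap tilting (assumption
    made for brevity; the paper's method differs).
    """
    rng = np.random.default_rng(seed)
    correct = np.asarray(correct, dtype=float)
    n, _ = correct.shape

    est = correct.mean(axis=0)                     # observed accuracies
    se = correct.std(axis=0, ddof=1) / np.sqrt(n)  # standard errors
    se = np.maximum(se, 1e-12)                     # guard against zero variance

    max_t = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)           # resample observations jointly
        est_b = correct[idx].mean(axis=0)
        # Largest standardized upward deviation across all models (maxT).
        max_t[b] = np.max((est_b - est) / se)

    # One shared critical value controls the error rate over all models at once.
    crit = np.quantile(max_t, 1.0 - alpha)
    return est, est - crit * se


# Simulated evaluation set: 200 observations, 5 hypothetical models.
demo_rng = np.random.default_rng(0)
true_acc = np.array([0.60, 0.65, 0.70, 0.75, 0.80])
correct = (demo_rng.random((200, 5)) < true_acc).astype(float)

est, lower = maxt_lower_bounds(correct, seed=1)
for j, (e, lo) in enumerate(zip(est, lower)):
    print(f"model {j}: accuracy {e:.3f}, simultaneous lower bound {lo:.3f}")
```

Because the critical value is taken from the distribution of the maximum statistic over all models, the bounds hold jointly, so reporting the bound of whichever model looks best afterwards does not invalidate the coverage statement in this simplified setting.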
Cite
Text
Rink and Brannath. "Post-Selection Confidence Bounds for Prediction Performance." Machine Learning, 2025. doi:10.1007/S10994-024-06632-W
Markdown
[Rink and Brannath. "Post-Selection Confidence Bounds for Prediction Performance." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/rink2025mlj-postselection/) doi:10.1007/S10994-024-06632-W
BibTeX
@article{rink2025mlj-postselection,
title = {{Post-Selection Confidence Bounds for Prediction Performance}},
author = {Rink, Pascal and Brannath, Werner},
journal = {Machine Learning},
year = {2025},
pages = {82},
doi = {10.1007/S10994-024-06632-W},
volume = {114},
url = {https://mlanthology.org/mlj/2025/rink2025mlj-postselection/}
}