Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles
Abstract
Feature bagging is a well-established ensembling method which aims to reduce prediction variance by combining predictions of many estimators trained on subsets or projections of features. Here, we develop a theory of feature-bagging in noisy least-squares ridge ensembles and simplify the resulting learning curves in the special case of equicorrelated data. Using analytical learning curves, we demonstrate that subsampling shifts the double-descent peak of a linear predictor. This leads us to introduce heterogeneous feature ensembling, with estimators built on varying numbers of feature dimensions, as a computationally efficient method to mitigate double-descent. Then, we compare the performance of a feature-subsampling ensemble to a single linear predictor, describing a trade-off between noise amplification due to subsampling and noise reduction due to ensembling. Our qualitative insights carry over to linear classifiers applied to image classification tasks with realistic datasets constructed using a state-of-the-art deep learning feature map.
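The sketch below illustrates the heterogeneous feature-subsampled ridge ensemble the abstract describes: each member is a ridge regressor fit on a random feature subset, with subset sizes varying across members, and predictions are averaged. This is a minimal illustration on synthetic Gaussian data, not the paper's exact experimental setup; names such as `subset_sizes` and the chosen sizes, ridge penalty, and noise level are illustrative assumptions.

```python
# Minimal sketch of a heterogeneous feature-subsampled ridge ensemble.
# Assumptions (not from the paper): synthetic Gaussian data, equal-weight
# averaging of members, and the specific subset sizes / ridge penalty below.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic noisy linear regression task.
n_train, n_features, noise_std = 200, 500, 0.5
w_true = rng.normal(size=n_features) / np.sqrt(n_features)
X = rng.normal(size=(n_train, n_features))
y = X @ w_true + noise_std * rng.normal(size=n_train)

# Heterogeneous ensemble: each member sees a random feature subset,
# with subset sizes varying across members.
subset_sizes = [50, 100, 200, 400]
members = []
for k in subset_sizes:
    idx = rng.choice(n_features, size=k, replace=False)
    model = Ridge(alpha=1e-2).fit(X[:, idx], y)
    members.append((idx, model))

def ensemble_predict(X_new):
    """Average the members' predictions (equal weights)."""
    preds = [m.predict(X_new[:, idx]) for idx, m in members]
    return np.mean(preds, axis=0)

# Evaluate on held-out data drawn from the same distribution.
X_test = rng.normal(size=(1000, n_features))
y_test = X_test @ w_true
mse = np.mean((ensemble_predict(X_test) - y_test) ** 2)
print(f"ensemble test MSE: {mse:.4f}")
```

Varying `subset_sizes` spreads the members' double-descent peaks across different effective dimensionalities, which is the intuition behind the heterogeneous ensembling the paper analyzes.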
Cite
Text
Ruben and Pehlevan. "Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles." Neural Information Processing Systems, 2023.

Markdown

[Ruben and Pehlevan. "Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/ruben2023neurips-learning/)

BibTeX
@inproceedings{ruben2023neurips-learning,
title = {{Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles}},
author = {Ruben, Ben and Pehlevan, Cengiz},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/ruben2023neurips-learning/}
}