Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks
Abstract
Learning curve extrapolation aims to predict model performance in later epochs of a machine learning training run, based on the performance in the first k epochs. In this work, we argue that, while the varying difficulty of extrapolating learning curves warrants a Bayesian approach, existing methods are (i) overly restrictive, and/or (ii) computationally expensive. We describe the first application of prior-data fitted neural networks (PFNs) in this context. PFNs use a transformer, pre-trained on data generated from a prior, to perform approximate Bayesian inference in a single forward pass. We present preliminary results demonstrating that PFNs can approximate the posterior predictive distribution more accurately, and multiple orders of magnitude faster, than MCMC, as well as obtain a lower average error when predicting the final accuracy of real learning curves from LCBench.
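The Bayesian inference that a PFN approximates in one forward pass can be illustrated with a minimal sketch. The following is not the paper's implementation: it assumes a simple saturating power-law prior over curves and computes the posterior predictive exactly over a discretized parameter grid, which is the quantity a PFN would learn to output directly from the first k observations.

```python
import numpy as np

rng = np.random.default_rng(0)

def curve(t, a, b):
    # Assumed parametric curve family (hypothetical choice for this sketch):
    # a saturating power law with asymptotic accuracy a and speed b.
    return a * (1.0 - (t + 1.0) ** (-b))

# Discretized uniform prior over curve parameters.
A = np.linspace(0.5, 1.0, 50)   # asymptotic accuracy
B = np.linspace(0.1, 2.0, 50)   # convergence speed
sigma = 0.02                     # assumed Gaussian observation noise

# Observe the first k epochs of a noisy "true" curve (a=0.9, b=0.7).
k, T = 10, 50
t_obs = np.arange(k)
y_obs = curve(t_obs, 0.9, 0.7) + rng.normal(0.0, sigma, k)

# Posterior weights over the parameter grid from the Gaussian likelihood.
aa, bb = np.meshgrid(A, B, indexing="ij")
pred_obs = curve(t_obs[None, None, :], aa[..., None], bb[..., None])
log_lik = -0.5 * np.sum((pred_obs - y_obs) ** 2, axis=-1) / sigma**2
w = np.exp(log_lik - log_lik.max())
w /= w.sum()

# Posterior predictive mean of accuracy at the final epoch T.
y_T = float(np.sum(w * curve(T, aa, bb)))
print(round(y_T, 3))
```

A PFN replaces this explicit grid computation (which scales poorly with prior complexity) with a transformer that, given `(t_obs, y_obs)` as context, emits the posterior predictive distribution at query epochs in a single forward pass.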
Cite
Text
Adriaensen et al. "Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks." NeurIPS 2022 Workshops: MetaLearn, 2022.
Markdown
[Adriaensen et al. "Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks." NeurIPS 2022 Workshops: MetaLearn, 2022.](https://mlanthology.org/neuripsw/2022/adriaensen2022neuripsw-efficient/)
BibTeX
@inproceedings{adriaensen2022neuripsw-efficient,
title = {{Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks}},
author = {Adriaensen, Steven and Rakotoarison, Herilalaina and Müller, Samuel and Hutter, Frank},
booktitle = {NeurIPS 2022 Workshops: MetaLearn},
year = {2022},
url = {https://mlanthology.org/neuripsw/2022/adriaensen2022neuripsw-efficient/}
}