Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks

Abstract

Learning curve extrapolation aims to predict a model's performance in later epochs of machine learning training, based on its performance in the first k epochs. In this work, we argue that, while the varying difficulty of extrapolating learning curves warrants a Bayesian approach, existing methods are (i) overly restrictive and/or (ii) computationally expensive. We describe the first application of prior-data fitted networks (PFNs) in this context. PFNs use a transformer, pre-trained on data generated from a prior, to perform approximate Bayesian inference in a single forward pass. We present preliminary results demonstrating that PFNs can approximate the posterior predictive distribution more accurately, and multiple orders of magnitude faster, than MCMC, and achieve a lower average error when predicting the final accuracy of real learning curves from LCBench.
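To make the setting concrete, the following is a minimal sketch of Bayesian learning curve extrapolation via brute-force importance sampling over a prior of curves. The pow3 parametric family, the parameter ranges, and the Gaussian noise model are all illustrative assumptions, not the paper's actual prior; a PFN would amortize exactly this kind of posterior predictive computation into a single pre-trained transformer forward pass.

```python
import math
import random

random.seed(0)

def sample_prior():
    """Draw parameters of a pow3 curve f(t) = c - a * t^(-alpha) (hypothetical prior)."""
    c = random.uniform(0.5, 1.0)      # asymptotic accuracy
    a = random.uniform(0.1, 0.5)      # initial gap to the asymptote
    alpha = random.uniform(0.3, 1.5)  # convergence speed
    return c, a, alpha

def curve(params, t):
    c, a, alpha = params
    return c - a * t ** (-alpha)

def posterior_predictive_final(observed, t_final=100, n_samples=5000, noise=0.02):
    """Weight prior samples by the Gaussian likelihood of the observed prefix
    and return the weighted mean of their values at t_final."""
    params = [sample_prior() for _ in range(n_samples)]
    log_liks = [
        sum(-0.5 * ((y - curve(p, t + 1)) / noise) ** 2
            for t, y in enumerate(observed))
        for p in params
    ]
    m = max(log_liks)  # shift before exponentiating, for numerical stability
    weights = [math.exp(ll - m) for ll in log_liks]
    finals = [curve(p, t_final) for p in params]
    return sum(w * f for w, f in zip(weights, finals)) / sum(weights)

# Observe the first k = 10 epochs of a noisy curve drawn from the same family
true_params = (0.85, 0.3, 0.8)
observed = [curve(true_params, t + 1) + random.gauss(0, 0.02) for t in range(10)]
prediction = posterior_predictive_final(observed)
```

The per-query cost of this sampling approach is what motivates amortization: the PFN is trained once on many such prior-sampled curves and then outputs the predictive distribution directly at inference time.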

Cite

Text

Adriaensen et al. "Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks." NeurIPS 2022 Workshops: MetaLearn, 2022.

Markdown

[Adriaensen et al. "Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks." NeurIPS 2022 Workshops: MetaLearn, 2022.](https://mlanthology.org/neuripsw/2022/adriaensen2022neuripsw-efficient/)

BibTeX

@inproceedings{adriaensen2022neuripsw-efficient,
  title     = {{Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks}},
  author    = {Adriaensen, Steven and Rakotoarison, Herilalaina and Müller, Samuel and Hutter, Frank},
  booktitle = {NeurIPS 2022 Workshops: MetaLearn},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/adriaensen2022neuripsw-efficient/}
}