Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks

Abstract

Learning curve extrapolation aims to predict a model's performance in later epochs of machine learning training, based on its performance in the first k epochs. In this work, we argue that, while the varying difficulty of extrapolating learning curves warrants a Bayesian approach, existing methods are (i) overly restrictive and/or (ii) computationally expensive. We describe the first application of prior-data fitted networks (PFNs) in this context. PFNs use a transformer, pre-trained on data generated from a prior, to perform approximate Bayesian inference in a single forward pass. We present preliminary results demonstrating that PFNs can approximate the posterior predictive distribution more accurately, and multiple orders of magnitude faster, than MCMC, and achieve a lower average error when predicting the final accuracy of real learning curves from LCBench.
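To make the setting concrete, the following is a minimal sketch of Bayesian learning curve extrapolation via brute-force importance sampling over a prior of curves. The pow3 parametric family, the parameter ranges, and the Gaussian noise model are all illustrative assumptions, not the paper's actual prior; a PFN would amortize exactly this kind of posterior predictive computation into a single pre-trained transformer forward pass.

```python
import math
import random

random.seed(0)

def sample_prior():
    """Draw parameters of a pow3 curve f(t) = c - a * t^(-alpha) (hypothetical prior)."""
    c = random.uniform(0.5, 1.0)      # asymptotic accuracy
    a = random.uniform(0.1, 0.5)      # initial gap to the asymptote
    alpha = random.uniform(0.3, 1.5)  # convergence speed
    return c, a, alpha

def curve(params, t):
    c, a, alpha = params
    return c - a * t ** (-alpha)

def posterior_predictive_final(observed, t_final=100, n_samples=5000, noise=0.02):
    """Weight prior samples by the Gaussian likelihood of the observed prefix
    and return the weighted mean of their values at t_final."""
    params = [sample_prior() for _ in range(n_samples)]
    log_liks = [
        sum(-0.5 * ((y - curve(p, t + 1)) / noise) ** 2
            for t, y in enumerate(observed))
        for p in params
    ]
    m = max(log_liks)  # shift before exponentiating, for numerical stability
    weights = [math.exp(ll - m) for ll in log_liks]
    finals = [curve(p, t_final) for p in params]
    return sum(w * f for w, f in zip(weights, finals)) / sum(weights)

# Observe the first k = 10 epochs of a noisy curve drawn from the same family
true_params = (0.85, 0.3, 0.8)
observed = [curve(true_params, t + 1) + random.gauss(0, 0.02) for t in range(10)]
prediction = posterior_predictive_final(observed)
```

The per-query cost of this sampling approach is what motivates amortization: the PFN is trained once on many such prior-sampled curves and then outputs the predictive distribution directly at inference time.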

Cite

Text

Adriaensen et al. "Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks." NeurIPS 2022 Workshops: MetaLearn, 2022.

Markdown

[Adriaensen et al. "Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks." NeurIPS 2022 Workshops: MetaLearn, 2022.](https://mlanthology.org/neuripsw/2022/adriaensen2022neuripsw-efficient/)

BibTeX

@inproceedings{adriaensen2022neuripsw-efficient,
  title     = {{Efficient Bayesian Learning Curve Extrapolation Using Prior-Data Fitted Networks}},
  author    = {Adriaensen, Steven and Rakotoarison, Herilalaina and Müller, Samuel and Hutter, Frank},
  booktitle = {NeurIPS 2022 Workshops: MetaLearn},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/adriaensen2022neuripsw-efficient/}
}