Transformers Can Do Bayesian-Inference by Meta-Learning on Prior-Data
Abstract
Currently, it is hard to reap the benefits of deep learning for Bayesian methods. We present Prior-Data Fitted Networks (PFNs), a method that allows us to employ large-scale machine learning techniques to approximate a large set of posteriors. The only requirement for PFNs is the ability to sample from a prior distribution over supervised learning tasks (or functions). The method repeatedly draws a task (or function) from this prior, draws a set of data points and their labels from it, masks one of the labels, and learns to make probabilistic predictions for it based on the set-valued input of the remaining data points. Presented with samples from a new supervised learning task as input, the trained network can then make probabilistic predictions for arbitrary other data points in a single forward propagation, effectively having learned to perform Bayesian inference. We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems, with over 200-fold speedups in multiple setups compared to current methods. We obtain strong results in such diverse areas as Gaussian process regression and Bayesian neural networks, demonstrating the generality of PFNs.
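The training procedure the abstract describes (sample a task from the prior, draw labeled points, mask one label, predict it from the rest) can be summarized in a few lines. Below is a minimal sketch in PyTorch: `sample_task_from_prior`, the tiny Transformer, and the Gaussian output head are illustrative stand-ins under assumed names, not the authors' implementation, which uses a more expressive predictive distribution and attention masking between context and query points.

```python
import torch
import torch.nn as nn

# Hypothetical prior over tasks: a random linear function with Gaussian
# noise. Any prior you can sample supervised datasets from would do.
def sample_task_from_prior(n_points: int, dim: int = 1):
    w = torch.randn(dim, 1)                      # latent function parameters
    x = torch.randn(n_points, dim)               # inputs
    y = x @ w + 0.1 * torch.randn(n_points, 1)   # noisy labels
    return x, y

class PFN(nn.Module):
    """Set-valued model: attends over labeled points to predict a held-out label."""
    def __init__(self, dim: int = 1, d_model: int = 64):
        super().__init__()
        self.embed_ctx = nn.Linear(dim + 1, d_model)   # embed (x, y) context pairs
        self.embed_query = nn.Linear(dim, d_model)     # embed query x (label masked)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 2)              # predictive mean and log-variance

    def forward(self, x_ctx, y_ctx, x_query):
        ctx = self.embed_ctx(torch.cat([x_ctx, y_ctx], dim=-1))  # (B, N, d)
        qry = self.embed_query(x_query)                          # (B, 1, d)
        h = self.encoder(torch.cat([ctx, qry], dim=1))
        mean, log_var = self.head(h[:, -1]).chunk(2, dim=-1)
        return mean, log_var

model = PFN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    x, y = sample_task_from_prior(n_points=16)
    x, y = x.unsqueeze(0), y.unsqueeze(0)          # batch of one task
    # Mask the last label; the remaining points form the observed context.
    x_ctx, y_ctx, x_q, y_q = x[:, :-1], y[:, :-1], x[:, -1:], y[:, -1:]
    mean, log_var = model(x_ctx, y_ctx, x_q)
    # Negative log-likelihood of the held-out label under the Gaussian output.
    nll = 0.5 * (log_var + (y_q.squeeze(1) - mean) ** 2 / log_var.exp()).mean()
    opt.zero_grad()
    nll.backward()
    opt.step()
```

At inference time the same forward pass serves as approximate Bayesian inference: the observed points of a new task are fed as context together with a query point, and the network outputs a posterior predictive in a single forward propagation, with no per-task gradient steps, which is where the reported speedups over per-task inference methods come from.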
Cite
Text
Müller et al. "Transformers Can Do Bayesian-Inference by Meta-Learning on Prior-Data." NeurIPS 2021 Workshops: MetaLearn, 2021.

Markdown
[Müller et al. "Transformers Can Do Bayesian-Inference by Meta-Learning on Prior-Data." NeurIPS 2021 Workshops: MetaLearn, 2021.](https://mlanthology.org/neuripsw/2021/muller2021neuripsw-transformers/)

BibTeX
@inproceedings{muller2021neuripsw-transformers,
  title     = {{Transformers Can Do Bayesian-Inference by Meta-Learning on Prior-Data}},
  author    = {Müller, Samuel and Hollmann, Noah and Arango, Sebastian Pineda and Grabocka, Josif and Hutter, Frank},
  booktitle = {NeurIPS 2021 Workshops: MetaLearn},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/muller2021neuripsw-transformers/}
}