TabPFGen – Tabular Data Generation with TabPFN
Abstract
Advances in deep generative modelling have not translated well to tabular data. We argue that this is caused by a mismatch in structure between popular generative models and _discriminative_ models of tabular data. We thus devise a technique to turn TabPFN -- a highly performant transformer initially designed for in-context discriminative tabular tasks -- into an energy-based generative model, which we dub _TabPFGen_. This novel framework leverages the pre-trained TabPFN as part of the energy function and does not require any additional training or hyperparameter tuning, thus inheriting TabPFN's in-context learning capability. We can sample from TabPFGen analogously to other energy-based models. We demonstrate strong results on standard generative modelling tasks, including data augmentation, class-balancing, and imputation, unlocking a new frontier of tabular data generation.
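The abstract does not spell out the energy function or the sampler, so the following is only a generic sketch of the idea it describes: treating a classifier's logits as an energy and drawing samples with Langevin dynamics. It makes no claim to be the TabPFGen method. The toy `logits` function (negative squared distance to hypothetical class centroids) is an assumption standing in for a pre-trained model such as TabPFN, and all names, step sizes, and the energy definition `E(x) = -logsumexp_y logits(x)[y]` are illustrative choices.

```python
import numpy as np

def logits(x, centroids):
    # Toy stand-in classifier (an assumption, not TabPFN):
    # the logit for class y is -0.5 * ||x - mu_y||^2.
    diff = x[:, None, :] - centroids[None, :, :]      # (n, classes, d)
    return -0.5 * np.sum(diff ** 2, axis=-1)          # (n, classes)

def energy(x, centroids):
    # Illustrative energy: E(x) = -logsumexp over class logits.
    z = logits(x, centroids)
    m = z.max(axis=1, keepdims=True)
    return -(m[:, 0] + np.log(np.exp(z - m).sum(axis=1)))

def energy_grad(x, centroids):
    # Analytic gradient of the energy above:
    # sum_y softmax(logits)[y] * (x - mu_y).
    z = logits(x, centroids)
    p = np.exp(z - z.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                 # (n, classes)
    diff = x[:, None, :] - centroids[None, :, :]
    return np.sum(p[:, :, None] * diff, axis=1)       # (n, d)

def langevin_sample(centroids, n_samples=256, n_steps=500, step=0.1, seed=0):
    # Unadjusted Langevin dynamics: a gradient step on the energy plus
    # Gaussian noise scaled by sqrt(step).
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, centroids.shape[1]))
    for _ in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = x - 0.5 * step * energy_grad(x, centroids) + np.sqrt(step) * noise
    return x

# Two hypothetical class centroids in a 2-feature table.
centroids = np.array([[-2.0, 0.0], [2.0, 0.0]])
samples = langevin_sample(centroids)
```

With this toy energy the samples concentrate around the two centroids. In a real setting the gradient would come from automatic differentiation through the model rather than a closed form.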
Cite
Text
Ma et al. "TabPFGen – Tabular Data Generation with TabPFN." NeurIPS 2023 Workshops: TRL, 2023.
Markdown
[Ma et al. "TabPFGen – Tabular Data Generation with TabPFN." NeurIPS 2023 Workshops: TRL, 2023.](https://mlanthology.org/neuripsw/2023/ma2023neuripsw-tabpfgen/)
BibTeX
@inproceedings{ma2023neuripsw-tabpfgen,
title = {{TabPFGen – Tabular Data Generation with TabPFN}},
author = {Ma, Junwei and Dankar, Apoorv and Stein, George and Yu, Guangwei and Caterini, Anthony},
booktitle = {NeurIPS 2023 Workshops: TRL},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/ma2023neuripsw-tabpfgen/}
}