TabPFGen – Tabular Data Generation with TabPFN

Abstract

Advances in deep generative modelling have not translated well to tabular data. We argue that this is caused by a mismatch in structure between popular generative models and _discriminative_ models of tabular data. We thus devise a technique to turn TabPFN, a highly performant transformer initially designed for in-context discriminative tabular tasks, into an energy-based generative model, which we dub _TabPFGen_. This novel framework leverages the pre-trained TabPFN as part of the energy function and does not require any additional training or hyperparameter tuning, thus inheriting TabPFN's in-context learning capability. We can sample from TabPFGen analogously to other energy-based models. We demonstrate strong results on standard generative modelling tasks, including data augmentation, class-balancing, and imputation, unlocking a new frontier of tabular data generation.
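To make the idea of sampling from a classifier-derived energy function concrete, here is a minimal sketch. It is not the TabPFGen energy function, which the abstract does not spell out. It assumes a generic construction in which the energy is the negative log-sum-exp of class logits, and it samples with stochastic gradient Langevin dynamics, a common choice for energy-based models. The toy classifier (logits given by negative squared distance to fixed class centres) is a hypothetical stand-in for TabPFN, and all function names and step sizes are illustrative.

```python
import numpy as np

def class_logits(x, centers):
    # Hypothetical stand-in classifier (NOT TabPFN): the logit for class y
    # is -0.5 * ||x - center_y||^2. x: (n, d), centers: (k, d) -> (n, k).
    diff = x[:, None, :] - centers[None, :, :]
    return -0.5 * np.sum(diff ** 2, axis=-1)

def energy(x, centers):
    # Assumed energy: E(x) = -logsumexp_y logits(x)[y], computed stably.
    z = class_logits(x, centers)
    m = z.max(axis=1, keepdims=True)
    return -(m[:, 0] + np.log(np.exp(z - m).sum(axis=1)))

def energy_grad(x, centers):
    # Analytic gradient: dE/dx = x - sum_y p(y|x) * center_y,
    # where p(y|x) is the softmax of the logits.
    z = class_logits(x, centers)
    z = z - z.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    return x - p @ centers

def sgld_sample(x0, centers, n_steps=500, step_size=0.05, seed=0):
    # Langevin update: x <- x - (eps / 2) * grad E(x) + sqrt(eps) * noise.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size * energy_grad(x, centers) + np.sqrt(step_size) * noise
    return x

# Two illustrative class centres in 2-D; chains start far from both.
centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
x0 = np.full((64, 2), 8.0)
samples = sgld_sample(x0, centers)
```

With this toy energy, `exp(-E(x))` is an unnormalised mixture of unit-variance Gaussians around the class centres, so the chains should drift from the distant starting point toward low-energy regions near the centres. In a TabPFGen-style setup, the logits would instead come from the pre-trained in-context classifier, and the gradient would typically be obtained by automatic differentiation rather than in closed form.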

Cite

Text

Ma et al. "TabPFGen – Tabular Data Generation with TabPFN." NeurIPS 2023 Workshops: TRL, 2023.

Markdown

[Ma et al. "TabPFGen – Tabular Data Generation with TabPFN." NeurIPS 2023 Workshops: TRL, 2023.](https://mlanthology.org/neuripsw/2023/ma2023neuripsw-tabpfgen/)

BibTeX

@inproceedings{ma2023neuripsw-tabpfgen,
  title     = {{TabPFGen – Tabular Data Generation with TabPFN}},
  author    = {Ma, Junwei and Dankar, Apoorv and Stein, George and Yu, Guangwei and Caterini, Anthony},
  booktitle = {NeurIPS 2023 Workshops: TRL},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/ma2023neuripsw-tabpfgen/}
}