Adapting TabPFN for Zero-Inflated Metagenomic Data

Abstract

This paper introduces a new prior for TabPFN, a meta-learning method trained on synthetic datasets drawn from a predefined prior so that it approximates Bayesian inference. The new prior is designed to better accommodate the zero-inflated distributions characteristic of metagenomic data. We leave the model's architecture unchanged and modify only its prior, generating synthetic training data that replicates the sparsity and variability inherent in these datasets. Preliminary results on metagenomic classification tasks show significant improvements in predictive performance over the original TabPFN, with results competitive with state-of-the-art methods. This work emphasizes the need to tailor PFN priors to the specific statistical properties of biomedical data in order to make them more effective in precision medicine.
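To make the idea of a zero-inflated synthetic prior concrete, the sketch below samples count features from a negative binomial distribution and then forces a fraction of entries to zero. It is only an illustration under stated assumptions: the abstract does not specify the generative process the authors use, and the function name `sample_zero_inflated_counts`, the choice of a negative binomial base distribution, and all default parameter values are hypothetical.

```python
import numpy as np


def sample_zero_inflated_counts(n_samples, n_features, zero_prob=0.7,
                                nb_shape=2.0, nb_mean=50.0, seed=0):
    """Sample a zero-inflated count matrix (hypothetical helper, not the paper's prior).

    Each entry is drawn from a negative binomial with mean `nb_mean`,
    then replaced by a structural zero with probability `zero_prob`.
    """
    rng = np.random.default_rng(seed)
    # NumPy parameterises the negative binomial by (n, p); this p gives mean nb_mean.
    p = nb_shape / (nb_shape + nb_mean)
    counts = rng.negative_binomial(nb_shape, p, size=(n_samples, n_features))
    # Structural zeros: mask entries independently of the sampled count.
    zero_mask = rng.random((n_samples, n_features)) < zero_prob
    counts[zero_mask] = 0
    return counts


X = sample_zero_inflated_counts(n_samples=200, n_features=30)
sparsity = float((X == 0).mean())  # fraction of zero entries, close to zero_prob
```

In a PFN-style setup, matrices like `X` would be paired with synthetic labels to form the training tasks; how the authors do this is not described in the abstract.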

Cite

Text

Perciballi et al. "Adapting TabPFN for Zero-Inflated Metagenomic Data." NeurIPS 2024 Workshops: TRL, 2024.

Markdown

[Perciballi et al. "Adapting TabPFN for Zero-Inflated Metagenomic Data." NeurIPS 2024 Workshops: TRL, 2024.](https://mlanthology.org/neuripsw/2024/perciballi2024neuripsw-adapting/)

BibTeX

@inproceedings{perciballi2024neuripsw-adapting,
  title     = {{Adapting TabPFN for Zero-Inflated Metagenomic Data}},
  author    = {Perciballi, Giulia and Granese, Federica and Fall, Ahmad and Zehraoui, Farida and Prifti, Edi and Zucker, Jean-Daniel},
  booktitle = {NeurIPS 2024 Workshops: TRL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/perciballi2024neuripsw-adapting/}
}