MEDS-Torch: An ML Pipeline for Inductive Experiments for EHR Medical Foundation Models

Abstract

We introduce MEDS-Torch, a scalable and extensible pipeline for inductive experiments with sequence models on medical datasets that adhere to the MEDS format, a universal schema for medical time series data. Using this pipeline, we systematically compare three tokenization methods (Everything In Code, Triplet, and Text Code) and evaluate five transfer learning techniques, including autoregressive generative modeling and contrastive learning variations, across multiple predictive tasks on the MIMIC-IV EHR dataset. Our empirical analysis provides actionable insights into the effectiveness of each method and demonstrates significant performance differences among tokenization and pretraining combinations. By benchmarking these approaches against fully supervised learning models, we offer practical recommendations for selecting appropriate modeling strategies in diverse healthcare settings. MEDS-Torch depends exclusively on the MEDS schema, so it streamlines controlled experiments on medical datasets, promotes reproducibility and standardization in EHR research, and supports machine learning experiments in healthcare that do not rely on dataset-specific nuances.

Cite

Text

Oufattole et al. "MEDS-Torch: An ML Pipeline for Inductive Experiments for EHR Medical Foundation Models." NeurIPS 2024 Workshops: TSALM, 2024.

Markdown

[Oufattole et al. "MEDS-Torch: An ML Pipeline for Inductive Experiments for EHR Medical Foundation Models." NeurIPS 2024 Workshops: TSALM, 2024.](https://mlanthology.org/neuripsw/2024/oufattole2024neuripsw-medstorch/)

BibTeX

@inproceedings{oufattole2024neuripsw-medstorch,
  title     = {{MEDS-Torch: An ML Pipeline for Inductive Experiments for EHR Medical Foundation Models}},
  author    = {Oufattole, Nassim and Bergamaschi, Teya and Renc, Pawel and Kolo, Aleksia and McDermott, Matthew B.A. and Stultz, Collin},
  booktitle = {NeurIPS 2024 Workshops: TSALM},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/oufattole2024neuripsw-medstorch/}
}