IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity

Abstract

Utilizing electronic health records (EHR) for machine learning-driven clinical research has great potential to enhance outcome predictions and treatment personalization. Nonetheless, due to privacy and security concerns, the secondary use of EHR data is regulated, constraining researchers' access to it. Generating synthetic EHR data with deep learning methods is a viable and promising approach to mitigate privacy concerns, not only offering a supplementary resource for downstream applications but also sidestepping the privacy risks associated with real patient data. While prior efforts have concentrated on EHR data synthesis, significant challenges persist: addressing the heterogeneity of features (both temporal and non-temporal), structurally missing values, and the irregularity of temporal measures, and ensuring rigorous privacy of the real data used for model training. Existing works in this domain have addressed only one or two of the aforementioned challenges. In this work, we propose IGAMT, an innovative framework to generate privacy-preserving synthetic EHR data that not only maintains high quality with heterogeneous features, missing values, and irregular measures but also achieves differential privacy with an enhanced privacy-utility trade-off. Extensive experiments show that IGAMT significantly outperforms baseline and state-of-the-art models in terms of resemblance to real data and performance of downstream applications. Ablation studies also demonstrate the effectiveness of the techniques applied in IGAMT.

Cite

Text

Wang et al. "IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I14.29491

Markdown

[Wang et al. "IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/wang2024aaai-igamt/) doi:10.1609/AAAI.V38I14.29491

BibTeX

@inproceedings{wang2024aaai-igamt,
  title     = {{IGAMT: Privacy-Preserving Electronic Health Record Synthesization with Heterogeneity and Irregularity}},
  author    = {Wang, Wenjie and Tang, Pengfei and Lou, Jian and Shao, Yuanming and Waller, Lance and Ko, Yi-an and Xiong, Li},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {15634--15643},
  doi       = {10.1609/AAAI.V38I14.29491},
  url       = {https://mlanthology.org/aaai/2024/wang2024aaai-igamt/}
}