Adversarial Masking for Pretraining ECG Data Improves Downstream Model Generalizability

Abstract

Medical datasets often face data scarcity, as ground-truth labels must be generated by medical professionals. One mitigation strategy is to pretrain deep learning models on large, unlabelled datasets with self-supervised learning (SSL), but this introduces the issue of domain shift if the pretraining and task dataset distributions differ. Data augmentations are essential for improving the generalizability of SSL-pretrained models, but they tend to be either handcrafted or randomly applied. We use an adversarial model to generate masks as augmentations for 12-lead electrocardiogram (ECG) data, where the masks learn to occlude diagnostically relevant regions. Compared to random augmentations, models pretrained with adversarial masking reach better accuracy under a domain-shift condition and in data-scarce regimes on two diverse downstream tasks: arrhythmia classification and patient age estimation. Adversarial masking is competitive with state-of-the-art ECG augmentation methods, 3KG and random lead masking (RLM), and achieves further improvements when combined with them, demonstrating the generalizability of our method.
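For context on the baseline the abstract compares against, random lead masking (RLM) drops a random subset of the 12 ECG leads during pretraining. The sketch below is an illustrative NumPy implementation under assumed conventions (a `(leads, samples)` array, masking by zeroing), not the authors' code; the function name and `p_mask` parameter are hypothetical.

```python
import numpy as np

def random_lead_masking(ecg, p_mask=0.5, rng=None):
    """Zero out each ECG lead independently with probability p_mask.

    ecg: array of shape (leads, samples), e.g. (12, 5000) for a
    10-second recording sampled at 500 Hz. Returns a masked copy;
    the input array is left unchanged.
    """
    rng = rng if rng is not None else np.random.default_rng()
    keep = rng.random(ecg.shape[0]) >= p_mask   # True = keep this lead
    return ecg * keep[:, None]                  # broadcast mask over samples

# Usage: mask a synthetic 12-lead, 5000-sample ECG
ecg = np.ones((12, 5000))
masked = random_lead_masking(ecg, p_mask=0.5, rng=np.random.default_rng(0))
```

Adversarial masking, by contrast, replaces the random `keep` decision with a learned generator trained to occlude the regions most useful to the encoder, which is what the paper proposes.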

Cite

Text

Bo et al. "Adversarial Masking for Pretraining ECG Data Improves Downstream Model Generalizability." NeurIPS 2022 Workshops: TS4H, 2022.

Markdown

[Bo et al. "Adversarial Masking for Pretraining ECG Data Improves Downstream Model Generalizability." NeurIPS 2022 Workshops: TS4H, 2022.](https://mlanthology.org/neuripsw/2022/bo2022neuripsw-adversarial/)

BibTeX

@inproceedings{bo2022neuripsw-adversarial,
  title     = {{Adversarial Masking for Pretraining ECG Data Improves Downstream Model Generalizability}},
  author    = {Bo, Jessica and Huang, Hen-Wei and Chan, Alvin and Traverso, Giovanni},
  booktitle = {NeurIPS 2022 Workshops: TS4H},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/bo2022neuripsw-adversarial/}
}