ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers

Abstract

Self-supervised learning (SSL) has garnered increasing attention in electrocardiogram (ECG) analysis for its effectiveness in resource-limited settings. Existing state-of-the-art SSL methods rely on reconstructing time-frequency details, but owing to the inherent redundancy of ECG signals and inter-individual variability, these approaches often yield suboptimal performance. In contrast, discrete label prediction emerges as a superior pre-training objective, as it encourages models to efficiently abstract high-level ECG semantics. However, the continuity and substantial variability of ECG signals make it challenging to generate semantically meaningful discrete labels. To address this issue, we propose an ECG pre-training framework with a self-distillation semantic tokenizer (ECG2TOK), which maps continuous ECG signals into discrete labels for self-supervised training. Specifically, the tokenizer extracts semantically aware ECG embeddings via self-distillation and performs online clustering to generate semantically rich discrete labels. The SSL model is then trained with a masking strategy and discrete label prediction to encourage the abstraction of high-level semantic representations. We evaluate ECG2TOK on six downstream tasks, demonstrating that it efficiently achieves state-of-the-art performance, with up to a 30.73% AUC improvement in low-resource scenarios. Moreover, visualization experiments show that the discrete labels generated by ECG2TOK exhibit consistent semantics closely associated with clinical features. Our code is available at https://github.com/YXYanova/ECG2TOK.
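To make the described pipeline concrete, below is a minimal PyTorch sketch of the three stages named in the abstract: self-distillation between a student and an EMA teacher encoder, online clustering of teacher embeddings against learnable prototypes to produce discrete labels, and masked prediction of those labels as the SSL objective. All module names, shapes, loss weightings, and hyperparameters here are illustrative assumptions, not the authors' implementation; consult the linked repository for the actual method.

# Minimal illustrative sketch of the pipeline described in the abstract.
# Everything here (architecture, hyperparameters, losses) is an assumption
# for exposition, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy 1-D conv encoder mapping an ECG signal to patch embeddings."""
    def __init__(self, dim=64, patch=25):
        super().__init__()
        self.proj = nn.Conv1d(1, dim, kernel_size=patch, stride=patch)

    def forward(self, x):                # x: (B, 1, T)
        z = self.proj(x)                 # (B, dim, T // patch)
        return z.transpose(1, 2)         # (B, N, dim) patch embeddings

class Tokenizer(nn.Module):
    """Self-distillation tokenizer: a student encoder matches an EMA teacher,
    and teacher embeddings are clustered online against learnable prototypes
    whose assignment indices serve as discrete labels."""
    def __init__(self, dim=64, n_tokens=128, momentum=0.99):
        super().__init__()
        self.student = Encoder(dim)
        self.teacher = Encoder(dim)
        self.teacher.load_state_dict(self.student.state_dict())
        for p in self.teacher.parameters():
            p.requires_grad = False
        self.prototypes = nn.Parameter(torch.randn(n_tokens, dim))
        self.m = momentum

    @torch.no_grad()
    def ema_update(self):
        # Teacher weights track the student by exponential moving average.
        for ps, pt in zip(self.student.parameters(), self.teacher.parameters()):
            pt.mul_(self.m).add_(ps.detach(), alpha=1 - self.m)

    def forward(self, x):
        zs = F.normalize(self.student(x), dim=-1)
        with torch.no_grad():
            zt = F.normalize(self.teacher(x), dim=-1)
        # Self-distillation: student embeddings match the teacher's.
        distill_loss = (1 - (zs * zt).sum(-1)).mean()
        # Online clustering: nearest prototype gives the discrete label.
        protos = F.normalize(self.prototypes, dim=-1)
        tokens = (zt @ protos.t()).argmax(-1)          # (B, N) labels
        # Pull student embeddings toward their assigned prototypes so the
        # codebook stays semantically aligned (temperature is an assumption).
        cluster_loss = F.cross_entropy(
            (zs @ protos.t()).flatten(0, 1) / 0.1, tokens.flatten())
        return tokens, distill_loss + cluster_loss

def masked_token_prediction(model, head, tokenizer, x, mask_ratio=0.5):
    """SSL objective: zero out random patch embeddings and predict the
    tokenizer's discrete labels at the masked positions."""
    with torch.no_grad():
        tokens, _ = tokenizer(x)          # frozen tokenizer provides targets
    z = model(x)                          # (B, N, dim)
    mask = torch.rand(z.shape[:2], device=z.device) < mask_ratio
    z = z.masked_fill(mask.unsqueeze(-1), 0.0)   # crude masking stand-in
    logits = head(z)                      # (B, N, n_tokens)
    return F.cross_entropy(logits[mask], tokens[mask])

if __name__ == "__main__":
    x = torch.randn(4, 1, 500)            # four toy single-lead ECG windows
    tokenizer = Tokenizer()
    tokens, tok_loss = tokenizer(x)       # stage 1: train the tokenizer
    tokenizer.ema_update()
    model, head = Encoder(), nn.Linear(64, 128)
    ssl_loss = masked_token_prediction(model, head, tokenizer, x)
    print(tokens.shape, tok_loss.item(), ssl_loss.item())

In practice the tokenizer would be trained to convergence first and then frozen while the SSL model optimizes only the masked-prediction loss; the single forward pass above merely shows how the pieces connect.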

Cite

Text

Yuan et al. "ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/1110

Markdown

[Yuan et al. "ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/yuan2025ijcai-ecg/) doi:10.24963/IJCAI.2025/1110

BibTeX

@inproceedings{yuan2025ijcai-ecg,
  title     = {{ECG2TOK: ECG Pre-Training with Self-Distillation Semantic Tokenizers}},
  author    = {Yuan, Xiaoyan and Wang, Wei and Liu, Han and Chen, Jian and Hu, Xiping},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {9990--9998},
  doi       = {10.24963/IJCAI.2025/1110},
  url       = {https://mlanthology.org/ijcai/2025/yuan2025ijcai-ecg/}
}