Advancing Radiograph Representation Learning with Masked Record Modeling

Abstract

Modern studies in radiograph representation learning (R$^2$L) rely on either self-supervision to encode invariant semantics or associated radiology reports to incorporate medical expertise, while the complementarity between the two has received little attention. To explore this, we formulate self-completion and report-completion as two complementary objectives and present a unified framework based on masked record modeling (MRM). In practice, MRM reconstructs masked image patches and masked report tokens following a multi-task scheme to learn knowledge-enhanced semantic representations. With MRM pre-training, we obtain pre-trained models that transfer well to various radiography tasks. Specifically, we find that MRM offers superior performance in label-efficient fine-tuning. For instance, MRM achieves 88.5% mean AUC on CheXpert using 1% labeled data, outperforming previous R$^2$L methods trained with 100% of the labels. On NIH ChestX-ray, MRM outperforms the best-performing counterpart by about 3% under small labeling ratios. In addition, MRM surpasses self- and report-supervised pre-training in identifying the pneumonia type and the pneumothorax area, sometimes by large margins.
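The multi-task scheme described in the abstract, reconstructing masked image patches and masked report tokens together, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mask ratios, the choice of mean squared error for patches and cross-entropy for tokens, and the `weight` parameter are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def random_mask(n, ratio, rng):
    """Pick round(n * ratio) distinct positions out of n to mask (hypothetical helper)."""
    k = int(round(n * ratio))
    return sorted(rng.sample(range(n), k))


def masked_mse(pred, target, masked_idx):
    """Mean squared error computed over masked image patches only (assumed image loss)."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(pred[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


def masked_cross_entropy(probs, token_ids, masked_idx):
    """Mean negative log-likelihood over masked report tokens only (assumed text loss)."""
    if not masked_idx:
        return 0.0
    nll = sum(-math.log(probs[i][token_ids[i]]) for i in masked_idx)
    return nll / len(masked_idx)


def mrm_style_loss(image_loss, report_loss, weight=1.0):
    """Combine the two completion objectives; `weight` is an assumed balancing factor."""
    return image_loss + weight * report_loss


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed sizes and mask ratios, chosen only for the demonstration.
    patch_mask = random_mask(16, 0.75, rng)
    token_mask = random_mask(8, 0.5, rng)

    # Stand-in "model outputs": random patch predictions and uniform token probabilities.
    target_patches = [[rng.random() for _ in range(4)] for _ in range(16)]
    pred_patches = [[rng.random() for _ in range(4)] for _ in range(16)]
    token_ids = [rng.randrange(5) for _ in range(8)]
    token_probs = [[0.2] * 5 for _ in range(8)]

    loss = mrm_style_loss(
        masked_mse(pred_patches, target_patches, patch_mask),
        masked_cross_entropy(token_probs, token_ids, token_mask),
    )
    print(f"combined loss: {loss:.4f}")
```

Both losses are computed only at masked positions, so the model is scored on what it has to complete rather than on what it can copy from visible input.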

Cite

Text

Zhou et al. "Advancing Radiograph Representation Learning with Masked Record Modeling." International Conference on Learning Representations, 2023.

Markdown

[Zhou et al. "Advancing Radiograph Representation Learning with Masked Record Modeling." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/zhou2023iclr-advancing/)

BibTeX

@inproceedings{zhou2023iclr-advancing,
  title     = {{Advancing Radiograph Representation Learning with Masked Record Modeling}},
  author    = {Zhou, Hong-Yu and Lian, Chenyu and Wang, Liansheng and Yu, Yizhou},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/zhou2023iclr-advancing/}
}