Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias

Abstract

The scarcity of data presents a critical obstacle to the efficacy of medical vision-language pre-training (VLP). A potential solution lies in combining datasets from various language communities. Nevertheless, the main challenge stems from the complexity of integrating diverse syntax and semantics, language-specific medical terminology, and culture-specific implicit knowledge. Therefore, one crucial aspect to consider is the presence of community bias caused by different languages. This paper presents a novel framework named Unifying Cross-Lingual Medical Vision-Language Pre-Training (Med-UniC), designed to integrate multi-modal medical data from the two most prevalent languages, English and Spanish. Specifically, we propose Cross-lingual Text Alignment Regularization (CTR) to explicitly unify cross-lingual semantic representations of medical reports originating from diverse language communities. CTR is optimized through latent language disentanglement, so that the optimization objective does not depend on negative samples, thereby significantly mitigating the bias from determining positive-negative sample pairs within analogous medical reports. Furthermore, it ensures that the cross-lingual representation is not biased toward any specific language community. Med-UniC reaches superior performance across 5 medical image tasks and 10 datasets encompassing over 30 diseases, offering a versatile framework for unifying multi-modal medical data within diverse linguistic communities. The experimental outcomes highlight the presence of community bias in cross-lingual VLP. Reducing this bias enhances performance not only in vision-language tasks but also in uni-modal visual tasks.

Cite

Text

Wan et al. "Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias." Neural Information Processing Systems, 2023.

Markdown

[Wan et al. "Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/wan2023neurips-medunic/)

BibTeX

@inproceedings{wan2023neurips-medunic,
  title     = {{Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias}},
  author    = {Wan, Zhongwei and Liu, Che and Zhang, Mi and Fu, Jie and Wang, Benyou and Cheng, Sibo and Ma, Lei and Quilodrán-Casas, César and Arcucci, Rossella},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/wan2023neurips-medunic/}
}