Unpaired Multimodal Learning for Biological Datasets

Abstract

Multimodal learning holds tremendous promise for biology, providing a path to integrate diverse data types and ultimately construct a more complete picture of underlying biological mechanisms. However, most existing approaches for multimodal learning require paired samples—an impractical assumption in biology, where measurement devices often destroy samples (e.g., RNA sequencing). To address this challenge, we introduce IntraPair InterCluster (IPIC), a novel contrastive approach for multimodal learning that departs from traditional reliance on paired data by requiring only treatment-group labels. IPIC aligns modalities through intra-treatment group matching and inter-treatment group clustering, producing embeddings that are both accurate and biologically meaningful. In experiments on four curated multimodal biological datasets, IPIC consistently outperforms baseline approaches, highlighting its effectiveness in leveraging independently collected single-modality datasets for multimodal contrastive pre-training.
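To make the idea of label-only alignment concrete, the following is a minimal sketch of a cross-modal contrastive loss in which any two samples from different modalities count as a positive pair when they share a treatment-group label. It is not the IPIC objective from the paper, which the abstract does not specify. The function name `group_contrastive_loss`, the `temperature` parameter, and the InfoNCE-style formulation are all assumptions made for illustration.

```python
import math


def _normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def group_contrastive_loss(za, zb, labels_a, labels_b, temperature=0.1):
    """Hypothetical label-supervised contrastive loss (modality A -> modality B).

    za, zb: lists of embedding vectors from two modalities; samples are NOT paired.
    labels_a, labels_b: treatment-group labels, the only supervision used.
    """
    za = [_normalize(v) for v in za]
    zb = [_normalize(v) for v in zb]
    losses = []
    for anchor, label in zip(za, labels_a):
        # Temperature-scaled cosine similarity to every modality-B sample.
        logits = [sum(p * q for p, q in zip(anchor, cand)) / temperature
                  for cand in zb]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Positives: modality-B samples from the same treatment group.
        pos = [l - log_z for l, lb in zip(logits, labels_b) if lb == label]
        if pos:  # anchors with no same-group candidate are skipped
            losses.append(-sum(pos) / len(pos))
    if not losses:
        raise ValueError("no anchor has a same-group sample in the other modality")
    return sum(losses) / len(losses)


# Toy usage: two treatment groups, different sample counts per modality.
za = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
zb = [[1.0, 0.1], [0.1, 1.0], [0.0, 1.0]]
loss = group_contrastive_loss(za, zb, [0, 0, 1], [0, 1, 1])
```

The sketch covers only one direction (A to B). A symmetric variant would average it with the B to A loss. Samples from other treatment groups act as negatives, which only loosely mirrors the inter-treatment-group clustering the abstract describes.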

Cite

Text

Ji et al. "Unpaired Multimodal Learning for Biological Datasets." Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, 2026.

Markdown

[Ji et al. "Unpaired Multimodal Learning for Biological Datasets." Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, 2026.](https://mlanthology.org/midl/2026/ji2026midl-unpaired/)

BibTeX

@inproceedings{ji2026midl-unpaired,
  title     = {{Unpaired Multimodal Learning for Biological Datasets}},
  author    = {Ji, Zongliang and Eastwood, Cian and Goldenberg, Anna and Liang, Paul Pu and Hartford, Jason and Krishnan, Rahul G. and Noutahi, Emmanuel},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  year      = {2026},
  pages     = {1840--1868},
  volume    = {315},
  url       = {https://mlanthology.org/midl/2026/ji2026midl-unpaired/}
}