scContrast: A Contrastive Learning Based Approach for Encoding Single-Cell Gene Expression Data

Abstract

Single-cell RNA sequencing (scRNA-seq) captures gene expression at individual-cell resolution, revealing critical insights into cellular diversity, disease processes, and developmental biology. However, a key challenge in scRNA-seq analysis is clustering similar cells across multiple batches, particularly when distinct sequencing protocols are used. In this work, we present scContrast, a semi-supervised contrastive learning method tailored for embedding scRNA-seq data from both plate- and droplet-based protocols into a universal representation space. By leveraging five simple augmentations, scContrast extracts biologically relevant signals from gene expression data while filtering out batch effects and technical artifacts. We trained scContrast on a subset of Tabula Muris tissues and evaluated its zero-shot performance on unseen tissues. Our results demonstrate that scContrast generalizes effectively to new tissues and outperforms the leading UCE approach in integrating scRNA-seq data from droplet- and plate-based sequencing protocols.
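The abstract does not list the five augmentations or the exact training objective, so the following is only a minimal sketch of the general idea of contrastive learning on expression profiles, not the scContrast implementation. It assumes a single hypothetical augmentation (`random_gene_dropout`), a random linear map standing in for a trained encoder, and the standard NT-Xent contrastive loss; the semi-supervised component mentioned in the abstract is not modeled.

```python
import numpy as np


def random_gene_dropout(x, rate, rng):
    """Hypothetical augmentation: zero out a random fraction of gene values per cell."""
    keep = rng.random(x.shape) >= rate
    return x * keep


def nt_xent_loss(z1, z2, temperature=0.5):
    """Standard NT-Xent loss; row i of z1 and row i of z2 are two views of the same cell."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    # L2-normalize so the dot product is cosine similarity (guard against zero rows)
    norms = np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1e-12)
    z = z / norms
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # a view is never its own positive or negative
    # The positive for row i is row i + n, and vice versa
    pos_idx = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable row-wise log-softmax
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos_idx].mean()


# Toy data: 8 cells x 50 genes of Poisson counts (an assumption for illustration only)
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(8, 50)).astype(float)
log_expr = np.log1p(counts)
weights = rng.normal(size=(50, 16))  # stand-in for a trained encoder

view_a = random_gene_dropout(log_expr, 0.2, rng) @ weights
view_b = random_gene_dropout(log_expr, 0.2, rng) @ weights
loss = nt_xent_loss(view_a, view_b)
```

Minimizing this loss pulls the two augmented views of each cell together and pushes views of different cells apart, which is the mechanism by which a contrastive encoder can learn to ignore nuisance variation introduced by the augmentations.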

Cite

Text

Li et al. "scContrast: A Contrastive Learning Based Approach for Encoding Single-Cell Gene Expression Data." ICLR 2025 Workshops: AI4NA, 2025. doi:10.1101/2025.04.07.647292

Markdown

[Li et al. "scContrast: A Contrastive Learning Based Approach for Encoding Single-Cell Gene Expression Data." ICLR 2025 Workshops: AI4NA, 2025.](https://mlanthology.org/iclrw/2025/li2025iclrw-sccontrast/) doi:10.1101/2025.04.07.647292

BibTeX

@inproceedings{li2025iclrw-sccontrast,
  title     = {{scContrast: A Contrastive Learning Based Approach for Encoding Single-Cell Gene Expression Data}},
  author    = {Li, Winston Yuxiang and Murtaza, Ghulam and Singh, Ritambhara},
  booktitle = {ICLR 2025 Workshops: AI4NA},
  year      = {2025},
  doi       = {10.1101/2025.04.07.647292},
  url       = {https://mlanthology.org/iclrw/2025/li2025iclrw-sccontrast/}
}