Supervised Contrastive Block Disentanglement

Abstract

Real-world datasets often combine data collected under different experimental conditions. This yields larger datasets, but also introduces spurious correlations that make it difficult to model the phenomena of interest. We address this by learning two embeddings to independently represent the phenomena of interest and the spurious correlations. The embedding representing the phenomena of interest is correlated with the target variable $y$, and is invariant to the environment variable $e$. In contrast, the embedding representing the spurious correlations is correlated with $e$. The invariance to $e$ is difficult to achieve on real-world datasets. Our primary contribution is an algorithm called Supervised Contrastive Block Disentanglement (SCBD) that effectively enforces this invariance. It is based purely on Supervised Contrastive Learning, and applies to real-world data better than existing approaches. We empirically validate SCBD on the real-world problem of batch correction. Using a dataset of 26 million Optical Pooled Screening images, we learn embeddings for 5,050 genetic perturbations that are nearly free of technical artifacts that arise from unintended variation across wells.
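The abstract does not spell out the SCBD objective, so the following is only a rough sketch of how two embedding blocks could be trained with supervised contrastive losses. The function `supcon_loss` follows the standard supervised contrastive loss (same-label samples are positives, every other sample in the batch appears in the denominator). The helper `block_objective`, the names `z_c` and `z_s`, the weight `alpha`, and in particular the invariance term (subtracting the contrastive loss of `z_c` with respect to $e$) are assumptions made for illustration and should not be read as the authors' formulation.

```python
import numpy as np


def supcon_loss(z, labels, temperature=0.1):
    """Standard supervised contrastive loss on a batch of embeddings.

    z: (n, d) array of embeddings; labels: length-n array of class labels.
    Anchors with no same-label partner in the batch are skipped.
    """
    z = np.asarray(z, dtype=np.float64)
    labels = np.asarray(labels)
    # Cosine similarity scaled by the temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Exclude each anchor from its own denominator.
    sim = np.where(not_self, sim, -np.inf)
    # Numerically stable log-softmax over the other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom
    # Positives: same label, different sample.
    pos = (labels[:, None] == labels[None, :]) & not_self
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0
    sum_log_prob_pos = np.where(pos, log_prob, 0.0).sum(axis=1)
    per_anchor = -sum_log_prob_pos[valid] / n_pos[valid]
    return float(per_anchor.mean())


def block_objective(z_c, z_s, y, e, alpha=1.0, temperature=0.1):
    """Hypothetical two-block objective (illustrative assumption only).

    z_c is pulled together by the target y, z_s by the environment e.
    The last term is an assumed stand-in for invariance: it rewards z_c
    for NOT clustering by e. The paper's actual invariance loss may differ.
    """
    align_c = supcon_loss(z_c, y, temperature)
    align_s = supcon_loss(z_s, e, temperature)
    cluster_c_by_e = supcon_loss(z_c, e, temperature)
    return align_c + align_s - alpha * cluster_c_by_e
```

In this sketch, `alpha` trades off predictiveness of `z_c` for $y$ against its invariance to $e$; setting `alpha=0` reduces the objective to two independent supervised contrastive losses, one per block.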

Cite

Text

Makino et al. "Supervised Contrastive Block Disentanglement." ICLR 2025 Workshops: MLGenX, 2025.

Markdown

[Makino et al. "Supervised Contrastive Block Disentanglement." ICLR 2025 Workshops: MLGenX, 2025.](https://mlanthology.org/iclrw/2025/makino2025iclrw-supervised/)

BibTeX

@inproceedings{makino2025iclrw-supervised,
  title     = {{Supervised Contrastive Block Disentanglement}},
  author    = {Makino, Taro and Park, Ji Won and Tagasovska, Natasa and Kudo, Takamasa and Coelho, Paula and Yao, Heming and Huetter, Jan-Christian and Leote, Ana Carolina and Hoeckendorf, Burkhard and Ra, Stephen and Richmond, David and Cho, Kyunghyun and Regev, Aviv and Lopez, Romain},
  booktitle = {ICLR 2025 Workshops: MLGenX},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/makino2025iclrw-supervised/}
}