Mitigating Spurious Features in Contrastive Learning with Spectral Regularization

Abstract

Neural networks generally prefer simple and easy-to-learn features. When these features are spuriously correlated with the labels, the network's performance can suffer, particularly for underrepresented classes or concepts. Self-supervised representation learning methods, such as contrastive learning, are especially prone to this issue, often resulting in worse performance on downstream tasks. We identify a key spectral signature of this failure: early reliance on dominant singular modes of the learned feature matrix. To mitigate this, we propose a novel framework that promotes a uniform eigenspectrum of the feature covariance matrix, encouraging diverse and semantically rich representations. Our method operates in a fully self-supervised setting, without relying on ground-truth labels or any additional information. Empirical results on SimCLR and SimSiam demonstrate consistent gains in robustness and transfer performance, suggesting broad applicability across self-supervised learning paradigms. Code: https://github.com/NaghmehGh/SpuriousCorrelation_SSRL
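The core idea of the abstract — penalizing a non-uniform eigenspectrum of the feature covariance so that no single dominant mode (e.g., a spurious feature) dominates the representation — can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function name `spectral_uniformity_loss` and the specific penalty (the KL divergence between the normalized eigenspectrum and the uniform distribution) are assumptions for exposition.

```python
import numpy as np

def spectral_uniformity_loss(features):
    """Illustrative penalty on a non-uniform eigenspectrum (assumed form).

    features: (n_samples, dim) array of embeddings.
    Returns a scalar >= 0 that is 0 iff all covariance eigenvalues are
    equal, i.e. the spectrum is perfectly uniform.
    """
    # Center the features and form the empirical covariance matrix.
    z = features - features.mean(axis=0, keepdims=True)
    cov = z.T @ z / len(z)

    # Eigenvalues of a symmetric PSD matrix; clip for numerical safety.
    eig = np.clip(np.linalg.eigvalsh(cov), 1e-12, None)

    # Normalize the spectrum into a probability vector.
    p = eig / eig.sum()

    # KL(p || uniform) = log(dim) - H(p): minimized (0) when the
    # spectrum is uniform, large when one mode dominates.
    return float(np.sum(p * np.log(p)) + np.log(len(p)))
```

In a self-supervised pipeline such a term would be added to the contrastive objective, e.g. `total = simclr_loss + lam * spectral_uniformity_loss(z)`, with `lam` a tunable weight; a rank-collapsed (spuriously dominated) feature matrix incurs a much larger penalty than an isotropic one.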

Cite

Text

Ghanooni et al. "Mitigating Spurious Features in Contrastive Learning with Spectral Regularization." Advances in Neural Information Processing Systems, 2025.

Markdown

[Ghanooni et al. "Mitigating Spurious Features in Contrastive Learning with Spectral Regularization." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/ghanooni2025neurips-mitigating/)

BibTeX

@inproceedings{ghanooni2025neurips-mitigating,
  title     = {{Mitigating Spurious Features in Contrastive Learning with Spectral Regularization}},
  author    = {Ghanooni, Naghmeh and Mustafa, Waleed and Wagner, Dennis and Fellenz, Sophie and Lin, Anthony Widjaja and Kloft, Marius},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/ghanooni2025neurips-mitigating/}
}