Contrastive Learning Is Spectral Clustering on Similarity Graph

Abstract

Contrastive learning is a powerful self-supervised learning method, but we have a limited theoretical understanding of how it works and why it works. In this paper, we prove that contrastive learning with the standard InfoNCE loss is equivalent to spectral clustering on the similarity graph. Using this equivalence as the building block, we extend our analysis to the CLIP model and rigorously characterize how similar multi-modal objects are embedded together. Motivated by our theoretical insights, we introduce the Kernel-InfoNCE loss, incorporating mixtures of kernel functions that outperform the standard Gaussian kernel on several vision datasets.
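As a rough illustration of the losses named in the abstract, the following is a minimal NumPy sketch of an InfoNCE-style loss in which the similarity between two embeddings is given by a kernel. It is not the authors' implementation. The function names, the choice of a Laplacian kernel as the second mixture component, the mixture weight, and the temperature values are all assumptions made for illustration. For unit-norm embeddings, the Gaussian kernel exp(-||a-b||^2 / tau) is proportional to exp(2 a·b / tau), so the Gaussian case reduces to the familiar softmax form of InfoNCE.

```python
import numpy as np

def gaussian_kernel(sq_dist, tau=1.0):
    # exp(-||a - b||^2 / tau)
    return np.exp(-sq_dist / tau)

def laplacian_kernel(sq_dist, tau=1.0):
    # exp(-||a - b|| / tau); used here only as an illustrative second kernel
    return np.exp(-np.sqrt(sq_dist) / tau)

def mixture_kernel(sq_dist, weight=0.5, tau_g=1.0, tau_l=1.0):
    # Hypothetical convex mixture of the two kernels above (assumed form)
    return (weight * gaussian_kernel(sq_dist, tau_g)
            + (1.0 - weight) * laplacian_kernel(sq_dist, tau_l))

def kernel_info_nce(z_a, z_b, kernel):
    """InfoNCE-style loss with a kernel as the similarity.

    z_a, z_b: arrays of shape (n, d); row i of z_a and row i of z_b
    are embeddings of two views of the same sample (the positive pair).
    """
    diff = z_a[:, None, :] - z_b[None, :, :]      # (n, n, d) pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)          # (n, n) squared distances
    k = kernel(sq_dist)                           # (n, n) kernel similarities
    probs = np.diag(k) / k.sum(axis=1)            # positive pair vs. all candidates
    return -np.mean(np.log(probs))
```

Passing `gaussian_kernel` gives the standard loss and passing `mixture_kernel` gives a mixture variant; both take the same embeddings, so swapping kernels changes only the similarity function.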

Cite

Text

Tan et al. "Contrastive Learning Is Spectral Clustering on Similarity Graph." International Conference on Learning Representations, 2024.

Markdown

[Tan et al. "Contrastive Learning Is Spectral Clustering on Similarity Graph." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/tan2024iclr-contrastive/)

BibTeX

@inproceedings{tan2024iclr-contrastive,
  title     = {{Contrastive Learning Is Spectral Clustering on Similarity Graph}},
  author    = {Tan, Zhiquan and Zhang, Yifan and Yang, Jingqin and Yuan, Yang},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/tan2024iclr-contrastive/}
}