EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence

ICML 2024 pp. 56966-56981

Abstract

A key challenge in contrastive learning is to generate negative samples from a large sample set to contrast with positive samples, so as to learn better encodings of the data. These negative samples often follow a softmax distribution that is dynamically updated during the training process. However, sampling from this distribution is non-trivial due to the high computational cost of computing the partition function. In this paper, we propose an $\underline{\text{E}}$fficient $\underline{\text{M}}$arkov $\underline{\text{C}}$hain Monte Carlo negative sampling method for $\underline{\text{C}}$ontrastive learning (EMC$^2$). We follow the global contrastive learning loss introduced in SogCLR, and propose EMC$^2$, which utilizes an adaptive Metropolis-Hastings subroutine to generate hardness-aware negative samples in an online fashion during optimization. We prove that EMC$^2$ finds an $\mathcal{O}(1/\sqrt{T})$-stationary point of the global contrastive loss in $T$ iterations. Compared to prior works, EMC$^2$ is the first algorithm that exhibits global convergence (to stationarity) regardless of the choice of batch size, while exhibiting low computation and memory cost. Numerical experiments validate that EMC$^2$ is effective with small-batch training and achieves comparable or better performance than baseline algorithms. We report results for pre-training image encoders on STL-10 and ImageNet-100.
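The computational trick behind the abstract's claim, sampling from a softmax distribution without evaluating its partition function, rests on a standard property of Metropolis-Hastings: the normalizing constant cancels in the acceptance ratio. The sketch below illustrates this idea only; it is not the paper's EMC$^2$ algorithm (which uses an adaptive subroutine), and the uniform proposal and function names are illustrative assumptions.

```python
import numpy as np

def mh_negative_sampler(scores, n_steps=100, rng=None):
    """Sample a negative index approximately from softmax(scores)
    via Metropolis-Hastings, never computing the partition function.

    `scores` holds unnormalized similarity logits for the candidate
    negatives; the acceptance ratio exp(scores[j] - scores[i]) is
    free of the softmax denominator, which cancels.
    """
    rng = rng or np.random.default_rng()
    n = len(scores)
    state = rng.integers(n)            # arbitrary initial negative index
    for _ in range(n_steps):
        proposal = rng.integers(n)     # uniform proposal over candidates
        # Accept with prob. min(1, pi(proposal)/pi(state)); compare in
        # log space so only the score difference is needed.
        if np.log(rng.random()) < scores[proposal] - scores[state]:
            state = proposal
    return int(state)
```

In an actual contrastive-learning loop the scores change every iteration as the encoder is updated, which is why the paper maintains the Markov chain online rather than restarting it; the sketch above runs a fresh chain per call for clarity.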

Cite

Text

Yau et al. "EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence." International Conference on Machine Learning, 2024.

Markdown

[Yau et al. "EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/yau2024icml-emc/)

BibTeX

@inproceedings{yau2024icml-emc,
  title     = {{EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence}},
  author    = {Yau, Chung-Yiu and Wai, Hoi To and Raman, Parameswaran and Sarkar, Soumajyoti and Hong, Mingyi},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {56966--56981},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/yau2024icml-emc/}
}