Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models

Abstract

The advent of large pre-trained models has brought about a paradigm shift in both visual representation learning and natural language processing. However, clustering unlabeled images, a fundamental and classic machine learning problem, still lacks an effective solution, particularly for large-scale datasets. In this paper, we propose a novel image clustering pipeline that leverages the powerful feature representations of large pre-trained models such as CLIP to cluster images effectively and efficiently at scale. We first develop a novel algorithm to estimate the number of clusters in a given dataset. We then show that the pre-trained features can be made significantly more structured by further optimizing the rate reduction objective. The resulting features may significantly improve clustering accuracy, e.g., from 57% to 66% on ImageNet-1k. Furthermore, by leveraging CLIP's multimodal bridge between image and text, we develop a simple yet effective self-labeling algorithm that produces meaningful text labels for the clusters. Through extensive experiments, we show that our pipeline works well on standard datasets such as CIFAR-10, CIFAR-100, and ImageNet-1k. It also extends to datasets without predefined labels, such as LAION-Aesthetics and WikiArts.
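The rate reduction objective mentioned in the abstract comes from the maximal coding rate reduction (MCR²) framework: it measures the gap ΔR between the coding rate of the full feature set and the average coding rate of its per-cluster parts. The PyTorch sketch below is a rough illustration of that quantity only, not the authors' implementation; the quantization parameter `eps` and the one-hot membership matrix `Pi` are simplifying assumptions.

```python
import torch


def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Coding rate R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z^T Z) for n x d features Z."""
    n, d = Z.shape
    I = torch.eye(d, device=Z.device, dtype=Z.dtype)
    return 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * (Z.T @ Z))


def rate_reduction(Z: torch.Tensor, Pi: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Rate reduction Delta R = R(Z) - sum_j (n_j / n) * R(Z_j).

    Z:  n x d feature matrix (e.g., CLIP image features).
    Pi: n x k one-hot cluster-membership matrix (a simplification; soft
        memberships are also possible in the MCR^2 literature).
    """
    n = Z.shape[0]
    r_total = coding_rate(Z, eps)
    r_compressed = Z.new_zeros(())
    for j in range(Pi.shape[1]):
        mask = Pi[:, j] > 0.5          # members of cluster j
        n_j = int(mask.sum())
        if n_j == 0:
            continue
        # Weight each cluster's coding rate by its share of the samples.
        r_compressed = r_compressed + (n_j / n) * coding_rate(Z[mask], eps)
    return r_total - r_compressed


# Toy usage: 100 unit-normalized 32-d features with 4 random clusters.
Z = torch.nn.functional.normalize(torch.randn(100, 32), dim=1)
Pi = torch.nn.functional.one_hot(torch.randint(0, 4, (100,)), num_classes=4).float()
print(rate_reduction(Z, Pi))
```

Maximizing ΔR over the features (and, in clustering, over the memberships) pushes different clusters apart while compressing each cluster internally, which is the structuring effect the abstract credits for the accuracy gain on ImageNet-1k.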

Cite

Text

Chu et al. "Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models." International Conference on Learning Representations, 2024.

Markdown

[Chu et al. "Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/chu2024iclr-image/)

BibTeX

@inproceedings{chu2024iclr-image,
  title     = {{Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models}},
  author    = {Chu, Tianzhe and Tong, Shengbang and Ding, Tianjiao and Dai, Xili and Haeffele, Benjamin David and Vidal, Rene and Ma, Yi},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/chu2024iclr-image/}
}