Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models

Abstract

The advent of large pre-trained models has brought about a paradigm shift in both visual representation learning and natural language processing. However, clustering unlabeled images, a fundamental and classic machine learning problem, still lacks an effective solution, particularly for large-scale datasets. In this paper, we propose a novel image clustering pipeline that leverages the powerful feature representations of large pre-trained models such as CLIP to cluster images effectively and efficiently at scale. We first develop a novel algorithm to estimate the number of clusters in a given dataset. We then show that the pre-trained features can be made significantly more structured by further optimizing the rate reduction objective. The resulting features may significantly improve clustering accuracy, e.g., from 57% to 66% on ImageNet-1k. Furthermore, by leveraging CLIP's multimodal bridge between image and text, we develop a simple yet effective self-labeling algorithm that produces meaningful text labels for the clusters. Through extensive experiments, we show that our pipeline works well on standard datasets such as CIFAR-10, CIFAR-100, and ImageNet-1k. It also extends to datasets without predefined labels, such as LAION-Aesthetics and WikiArts.
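The rate reduction objective mentioned in the abstract comes from the maximal coding rate reduction (MCR²) framework: it measures the gap ΔR between the coding rate of the full feature set and the average coding rate of its per-cluster parts. The PyTorch sketch below is a rough illustration of that quantity only, not the authors' implementation; the quantization parameter `eps` and the one-hot membership matrix `Pi` are simplifying assumptions.

```python
import torch


def coding_rate(Z: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Coding rate R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z^T Z) for n x d features Z."""
    n, d = Z.shape
    I = torch.eye(d, device=Z.device, dtype=Z.dtype)
    return 0.5 * torch.logdet(I + (d / (n * eps ** 2)) * (Z.T @ Z))


def rate_reduction(Z: torch.Tensor, Pi: torch.Tensor, eps: float = 0.5) -> torch.Tensor:
    """Rate reduction Delta R = R(Z) - sum_j (n_j / n) * R(Z_j).

    Z:  n x d feature matrix (e.g., CLIP image features).
    Pi: n x k one-hot cluster-membership matrix (a simplification; soft
        memberships are also possible in the MCR^2 literature).
    """
    n = Z.shape[0]
    r_total = coding_rate(Z, eps)
    r_compressed = Z.new_zeros(())
    for j in range(Pi.shape[1]):
        mask = Pi[:, j] > 0.5          # members of cluster j
        n_j = int(mask.sum())
        if n_j == 0:
            continue
        # Weight each cluster's coding rate by its share of the samples.
        r_compressed = r_compressed + (n_j / n) * coding_rate(Z[mask], eps)
    return r_total - r_compressed


# Toy usage: 100 unit-normalized 32-d features with 4 random clusters.
Z = torch.nn.functional.normalize(torch.randn(100, 32), dim=1)
Pi = torch.nn.functional.one_hot(torch.randint(0, 4, (100,)), num_classes=4).float()
print(rate_reduction(Z, Pi))
```

Maximizing ΔR over the features (and, in clustering, over the memberships) pushes different clusters apart while compressing each cluster internally, which is the structuring effect the abstract credits for the accuracy gain on ImageNet-1k.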

Cite

Text

Chu et al. "Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models." International Conference on Learning Representations, 2024.

Markdown

[Chu et al. "Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/chu2024iclr-image/)

BibTeX

@inproceedings{chu2024iclr-image,
  title     = {{Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models}},
  author    = {Chu, Tianzhe and Tong, Shengbang and Ding, Tianjiao and Dai, Xili and Haeffele, Benjamin David and Vidal, Rene and Ma, Yi},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/chu2024iclr-image/}
}