Cartesian K-Means

Abstract

A fundamental limitation of quantization techniques like the k-means clustering algorithm is the storage and runtime cost associated with the large number of clusters required to keep quantization errors small and model fidelity high. We develop new models with a compositional parameterization of cluster centers, so representational capacity increases super-linearly in the number of parameters. This allows one to effectively quantize data using billions or trillions of centers. We formulate two such models, Orthogonal k-means and Cartesian k-means. They are closely related to one another, to k-means, to methods for binary hash function optimization like ITQ [5], and to Product Quantization for vector quantization [7]. The models are tested on large-scale ANN retrieval tasks (1M GIST, 1B SIFT features), and on codebook learning for object recognition (CIFAR-10).
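
The compositional parameterization described in the abstract can be illustrated with a short sketch. The following is a minimal Product-Quantization-style illustration of the core idea, not the paper's full method (Cartesian k-means additionally learns a rotation of the input space, which this sketch omits); the function names fit_compositional_codebooks and encode are hypothetical, introduced here only for illustration. Each vector is split into m disjoint sub-vectors, a sub-codebook of k centers is fit per sub-vector, and the Cartesian product of sub-codebooks yields k**m effective centers while storing only m*k center vectors.

import numpy as np

def fit_compositional_codebooks(X, m=4, k=256, iters=10, seed=0):
    """Fit m independent sub-codebooks of k centers each by running
    k-means on disjoint sub-vectors. The Cartesian product of the
    sub-codebooks gives k**m effective centers (e.g. 256**4, roughly
    4.3 billion, for m=4 and k=256)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assert d % m == 0, "dimension must split evenly into m sub-vectors"
    s = d // m
    codebooks = []
    for j in range(m):
        sub = X[:, j * s:(j + 1) * s]
        # Initialize centers from randomly chosen data points.
        C = sub[rng.choice(n, size=k, replace=False)].copy()
        for _ in range(iters):
            # Assignment step: nearest center per sub-vector
            # (dense broadcasting; fine for a small illustrative n).
            d2 = ((sub[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            a = d2.argmin(1)
            # Update step: each center becomes the mean of its points.
            for c in range(k):
                pts = sub[a == c]
                if len(pts):
                    C[c] = pts.mean(0)
        codebooks.append(C)
    return codebooks

def encode(X, codebooks):
    """Encode each vector as m sub-codebook indices
    (one byte per sub-vector when k <= 256)."""
    m = len(codebooks)
    s = X.shape[1] // m
    codes = np.empty((X.shape[0], m), dtype=np.uint8)
    for j, C in enumerate(codebooks):
        sub = X[:, j * s:(j + 1) * s]
        d2 = ((sub[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        codes[:, j] = d2.argmin(1)
    return codes

# Example: 32-dimensional data split into m=4 sub-vectors with k=16
# centers each gives 16**4 = 65536 effective centers from only 64
# stored sub-centers, and each vector compresses to 4 bytes.
X = np.random.default_rng(1).normal(size=(1000, 32)).astype(np.float32)
cb = fit_compositional_codebooks(X, m=4, k=16)
codes = encode(X, cb)

The super-linear capacity claimed in the abstract is visible here: stored parameters grow linearly in m and k, while the number of representable centers grows as k**m.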

Cite

Text

Norouzi and Fleet. "Cartesian K-Means." Conference on Computer Vision and Pattern Recognition, 2013. doi:10.1109/CVPR.2013.388

Markdown

[Norouzi and Fleet. "Cartesian K-Means." Conference on Computer Vision and Pattern Recognition, 2013.](https://mlanthology.org/cvpr/2013/norouzi2013cvpr-cartesian/) doi:10.1109/CVPR.2013.388

BibTeX

@inproceedings{norouzi2013cvpr-cartesian,
  title     = {{Cartesian K-Means}},
  author    = {Norouzi, Mohammad and Fleet, David J.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2013},
  doi       = {10.1109/CVPR.2013.388},
  url       = {https://mlanthology.org/cvpr/2013/norouzi2013cvpr-cartesian/}
}