Variance Reduced K-Means Clustering

Abstract

Performing k-means clustering efficiently on a large-scale dataset is challenging. One reason is that k-means must scan a batch of training data to update the cluster centers at every iteration, which is time-consuming. In this paper, we propose a variance reduced k-means method, VRKM, which outperforms the state-of-the-art method and obtains a 4× speedup for large-scale clustering. The source code is available at https://github.com/YaweiZhao/VRKM_sofia-ml.
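The abstract does not spell out the update rule, so the following is only an illustrative sketch of how variance reduction can be combined with mini-batch k-means. It assumes a generic SVRG-style scheme: one full scan per epoch produces a snapshot gradient, which then corrects cheap mini-batch gradients. It is not the authors' VRKM implementation. The function names (`nearest`, `batch_gradient`, `objective`, `svrg_kmeans`) and all parameter defaults are hypothetical.

```python
import random


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
    )


def batch_gradient(points, centers):
    """Average gradient of 0.5 * ||x - c_assigned||^2 over `points`, per center."""
    dim = len(centers[0])
    grad = [[0.0] * dim for _ in centers]
    for x in points:
        k = nearest(x, centers)
        for d in range(dim):
            grad[k][d] += (centers[k][d] - x[d]) / len(points)
    return grad


def objective(points, centers):
    """Mean of 0.5 * squared distance from each point to its nearest center."""
    total = 0.0
    for x in points:
        k = nearest(x, centers)
        total += 0.5 * sum((p - c) ** 2 for p, c in zip(x, centers[k]))
    return total / len(points)


def svrg_kmeans(points, centers, epochs=10, inner_steps=20,
                batch_size=8, lr=0.5, seed=0):
    """SVRG-style k-means sketch (an assumption, not the paper's algorithm)."""
    rng = random.Random(seed)
    centers = [list(c) for c in centers]
    for _ in range(epochs):
        # One full scan per epoch: snapshot centers and their full gradient.
        snapshot = [list(c) for c in centers]
        full_grad = batch_gradient(points, snapshot)
        for _ in range(inner_steps):
            batch = rng.sample(points, batch_size)
            g_now = batch_gradient(batch, centers)
            g_snap = batch_gradient(batch, snapshot)
            for k in range(len(centers)):
                for d in range(len(centers[k])):
                    # Variance-reduced direction: mini-batch gradient,
                    # corrected by the snapshot's mini-batch and full gradients.
                    v = g_now[k][d] - g_snap[k][d] + full_grad[k][d]
                    centers[k][d] -= lr * v
    return centers
```

A usage example on two well-separated toy clusters (the data and initial centers are made up for illustration):

```python
points = [(cx + dx, cy + dy)
          for (cx, cy) in [(0.0, 0.0), (10.0, 10.0)]
          for dx in (-1.0, 0.0, 1.0)
          for dy in (-1.0, 0.0, 1.0)]
fitted = svrg_kmeans(points, [(1.0, 1.0), (9.0, 9.0)])
print(objective(points, fitted))
```

In this sketch the inner loop touches only `batch_size` points per step, and the full dataset is scanned once per epoch rather than at every iteration.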

Cite

Text

Zhao et al. "Variance Reduced K-Means Clustering." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12135

Markdown

[Zhao et al. "Variance Reduced K-Means Clustering." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/zhao2018aaai-variance/) doi:10.1609/AAAI.V32I1.12135

BibTeX

@inproceedings{zhao2018aaai-variance,
  title     = {{Variance Reduced K-Means Clustering}},
  author    = {Zhao, Yawei and Ming, Yuewei and Liu, Xinwang and Zhu, En and Yin, Jianping},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {8187-8188},
  doi       = {10.1609/AAAI.V32I1.12135},
  url       = {https://mlanthology.org/aaai/2018/zhao2018aaai-variance/}
}