Exact Acceleration of K-Means++ and K-Means

Abstract

K-Means++ and its distributed variant K-Means|| have become de facto tools for selecting the initial seeds of K-means. While alternatives have been developed, the effectiveness, ease of implementation,and theoretical grounding of the K-means++ and || methods have made them difficult to "best" from a holistic perspective. We focus on using triangle inequality based pruning methods to accelerate both of these algorithms to yield comparable or better run-time without sacrificing any of the benefits of these approaches. For both algorithms we are able to reduce distance computations by over 500×. For K-means++ this results in up to a 17×speedup in run-time and a551×speedup for K-means||. We achieve this with simple, but carefully chosen, modifications to known techniques which makes it easy to integrate our approach into existing implementations of these algorithms.

Cite

Text

Raff. "Exact Acceleration of K-Means++ and K-Means." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/403

Markdown

[Raff. "Exact Acceleration of K-Means++ and K-Means." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/raff2021ijcai-exact/) doi:10.24963/IJCAI.2021/403

BibTeX

@inproceedings{raff2021ijcai-exact,
  title     = {{Exact Acceleration of K-Means++ and K-Means}},
  author    = {Raff, Edward},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {2928-2935},
  doi       = {10.24963/IJCAI.2021/403},
  url       = {https://mlanthology.org/ijcai/2021/raff2021ijcai-exact/}
}