Distributed $k$-Means and $k$-Median Clustering on General Topologies
Abstract
This paper provides new algorithms for distributed clustering for two popular center-based objectives, $k$-median and $k$-means. These algorithms have provable guarantees and improve communication complexity over existing approaches. Following a classic approach in clustering by \cite{har2004coresets}, we reduce the problem of finding a clustering with low cost to the problem of finding a `coreset' of small size. We provide a distributed method for constructing a global coreset which improves over the previous methods by reducing the communication complexity, and which works over general communication topologies. We provide experimental evidence for this approach on both synthetic and real data sets.
Cite
Text
Balcan et al. "Distributed $k$-Means and $k$-Median Clustering on General Topologies." Neural Information Processing Systems, 2013.Markdown
[Balcan et al. "Distributed $k$-Means and $k$-Median Clustering on General Topologies." Neural Information Processing Systems, 2013.](https://mlanthology.org/neurips/2013/balcan2013neurips-distributed/)BibTeX
@inproceedings{balcan2013neurips-distributed,
title = {{Distributed $k$-Means and $k$-Median Clustering on General Topologies}},
author = {Balcan, Maria-Florina F and Ehrlich, Steven and Liang, Yingyu},
booktitle = {Neural Information Processing Systems},
year = {2013},
pages = {1995-2003},
url = {https://mlanthology.org/neurips/2013/balcan2013neurips-distributed/}
}