Coresets for Ordered Weighted Clustering

Abstract

We design coresets for Ordered k-Median, a generalization of classical clustering problems such as k-Median and k-Center. Its objective function is defined via the Ordered Weighted Averaging (OWA) paradigm of Yager (1988), where data points are weighted according to a predefined weight vector, but in order of their contribution to the objective (distance from the centers). A powerful data-reduction technique, called a coreset, is to summarize a point set $X$ in $\mathbb{R}^d$ into a small (weighted) point set $X’$, such that for every set of $k$ potential centers, the objective value of the coreset $X’$ approximates that of $X$ within factor $1\pm \epsilon$. When there are multiple objectives (weights), the above standard coreset might have limited usefulness, whereas in a simultaneous coreset, the above approximation holds for all weights (in addition to all centers). Our main result is a construction of a simultaneous coreset of size $O_{\epsilon, d}(k^2 \log^2 |X|)$ for Ordered k-Median. We validate our algorithm on a real geographical data set, and we find our coreset leads to a massive speedup of clustering computations, while maintaining high accuracy for a range of weights.

Cite

Text

Braverman et al. "Coresets for Ordered Weighted Clustering." International Conference on Machine Learning, 2019.

Markdown

[Braverman et al. "Coresets for Ordered Weighted Clustering." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/braverman2019icml-coresets/)

BibTeX

@inproceedings{braverman2019icml-coresets,
  title     = {{Coresets for Ordered Weighted Clustering}},
  author    = {Braverman, Vladimir and Jiang, Shaofeng H.-C. and Krauthgamer, Robert and Wu, Xuan},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {744-753},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/braverman2019icml-coresets/}
}