Coresets for Ordered Weighted Clustering
Abstract
We design coresets for Ordered k-Median, a generalization of classical clustering problems such as k-Median and k-Center. Its objective function is defined via the Ordered Weighted Averaging (OWA) paradigm of Yager (1988), where data points are weighted according to a predefined weight vector, but in order of their contribution to the objective (distance from the centers). A powerful data-reduction technique, called a coreset, is to summarize a point set $X$ in $\mathbb{R}^d$ into a small (weighted) point set $X’$, such that for every set of $k$ potential centers, the objective value of the coreset $X’$ approximates that of $X$ within factor $1\pm \epsilon$. When there are multiple objectives (weights), the above standard coreset might have limited usefulness, whereas in a simultaneous coreset, the above approximation holds for all weights (in addition to all centers). Our main result is a construction of a simultaneous coreset of size $O_{\epsilon, d}(k^2 \log^2 |X|)$ for Ordered k-Median. We validate our algorithm on a real geographical data set, and we find our coreset leads to a massive speedup of clustering computations, while maintaining high accuracy for a range of weights.
Cite
Text
Braverman et al. "Coresets for Ordered Weighted Clustering." International Conference on Machine Learning, 2019.Markdown
[Braverman et al. "Coresets for Ordered Weighted Clustering." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/braverman2019icml-coresets/)BibTeX
@inproceedings{braverman2019icml-coresets,
title = {{Coresets for Ordered Weighted Clustering}},
author = {Braverman, Vladimir and Jiang, Shaofeng H.-C. and Krauthgamer, Robert and Wu, Xuan},
booktitle = {International Conference on Machine Learning},
year = {2019},
pages = {744-753},
volume = {97},
url = {https://mlanthology.org/icml/2019/braverman2019icml-coresets/}
}