A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering

Abstract

We consider a framework in which the clustering algorithm gets as input a sample generated i.i.d by some unknown arbitrary distribution, and has to output a clustering of the full domain set, that is evaluated with respect to the underlying distribution. We provide general conditions on clustering problems that imply the existence of sampling based clusterings that approximate the optimal clustering. We show that the K -median clustering, as well as the Vector Quantization problem, satisfy these conditions. In particular our results apply to the sampling – based approximate clustering scenario. As a corollary, we get a sampling-based algorithm for the K -median clustering problem that finds an almost optimal set of centers in time depending only on the confidence and accuracy parameters of the approximation, but independent of the input size. Furthermore, in the Euclidean input case, the running time of our algorithm is independent of the Euclidean dimension.

Cite

Text

Ben-David. "A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering." Annual Conference on Computational Learning Theory, 2004. doi:10.1007/978-3-540-27819-1_29

Markdown

[Ben-David. "A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering." Annual Conference on Computational Learning Theory, 2004.](https://mlanthology.org/colt/2004/bendavid2004colt-framework/) doi:10.1007/978-3-540-27819-1_29

BibTeX

@inproceedings{bendavid2004colt-framework,
  title     = {{A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering}},
  author    = {Ben-David, Shai},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2004},
  pages     = {415-426},
  doi       = {10.1007/978-3-540-27819-1_29},
  url       = {https://mlanthology.org/colt/2004/bendavid2004colt-framework/}
}