Finding Planted Partitions in Nearly Linear Time Using Arrested Spectral Clustering

Abstract

We describe an algorithm for clustering using a similarity graph. Thealgorithm (a) runs in O(n log^3 n + m \log n) time on graphs withn vertices and m edges, and (b) with high probability, finds all``large enough'' clusters in a random graph generated according to theplanted partition model. We provide lower bounds that imply that our``large enough'' constraint cannot be improved much, even using acomputationally unbounded algorithm. We describe some experimentsrunning the algorithm and a few related algorithms on random graphswith partitions generated using a Chinese Restaurant Processes, andsome results of applying the algorithm to cluster DBLP titles.

Cite

Text

Bshouty and Long. "Finding Planted Partitions in Nearly Linear Time Using Arrested Spectral Clustering." International Conference on Machine Learning, 2010.

Markdown

[Bshouty and Long. "Finding Planted Partitions in Nearly Linear Time Using Arrested Spectral Clustering." International Conference on Machine Learning, 2010.](https://mlanthology.org/icml/2010/bshouty2010icml-finding/)

BibTeX

@inproceedings{bshouty2010icml-finding,
  title     = {{Finding Planted Partitions in Nearly Linear Time Using Arrested Spectral Clustering}},
  author    = {Bshouty, Nader H. and Long, Philip M.},
  booktitle = {International Conference on Machine Learning},
  year      = {2010},
  pages     = {135-142},
  url       = {https://mlanthology.org/icml/2010/bshouty2010icml-finding/}
}