Parallel Spectral Clustering

Song, Yangqiu; Chen, WenYen; Bai, Hongjie; Lin, Chih-Jen; Chang, Edward Y.

doi:10.1007/978-3-540-87481-2_25

ECML-PKDD 2008 pp. 374-389

Abstract

Spectral clustering algorithms have been shown to be more effective in finding clusters than most traditional algorithms. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the dataset is large. To perform clustering on large datasets, we propose to parallelize both memory use and computation on distributed computers. Through an empirical study on a large document dataset of 193,844 data instances and a large photo dataset of 637,137 data instances, we demonstrate that our parallel algorithm can effectively alleviate the scalability problem.

Cite

Text

Song et al. "Parallel Spectral Clustering." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008. doi:10.1007/978-3-540-87481-2_25

Markdown

[Song et al. "Parallel Spectral Clustering." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2008.](https://mlanthology.org/ecmlpkdd/2008/song2008ecmlpkdd-parallel/) doi:10.1007/978-3-540-87481-2_25

BibTeX

@inproceedings{song2008ecmlpkdd-parallel,
  title     = {{Parallel Spectral Clustering}},
  author    = {Song, Yangqiu and Chen, WenYen and Bai, Hongjie and Lin, Chih-Jen and Chang, Edward Y.},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2008},
  pages     = {374--389},
  doi       = {10.1007/978-3-540-87481-2_25},
  url       = {https://mlanthology.org/ecmlpkdd/2008/song2008ecmlpkdd-parallel/}
}