Large Scale Sparse Clustering
Abstract
Large-scale clustering has been widely applied in many fields and has received much attention in recent years. However, most existing large-scale clustering methods achieve only mediocre performance because they are sensitive to the noise that is unavoidable in large-scale data. To address this problem, we propose a large-scale sparse clustering (LSSC) algorithm based on a two-step optimization strategy: 1) k-means clustering over the large-scale data to obtain initial clustering results; 2) refinement of the initial results with a sparse coding algorithm. To keep the second step scalable to large-scale data, we also use nonlinear approximation and dimension reduction techniques to speed up the sparse coding algorithm. Experimental results on both synthetic and real-world datasets demonstrate the promising performance of our LSSC algorithm.
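The two-step strategy in the abstract can be illustrated with a small sketch. It is not the authors' formulation: the abstract does not specify the sparse coding objective, so the refinement below is a stand-in that codes each cluster's indicator vector over a Gaussian-kernel dictionary anchored at the k-means centroids, using ISTA for the L1-regularized least-squares problem. The names `kmeans`, `ista`, and `refine`, the kernel dictionary, and the parameters `sigma` and `lam` are all assumptions made for illustration, and the paper's nonlinear approximation and dimension reduction speedups are not reproduced.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all current seeds.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: mean of each non-empty cluster.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels

def soft_threshold(x, t):
    """Proximal operator of t * |x|."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def ista(B, y, lam=0.01, iters=200):
    """Approximately solve min_a 0.5 * ||y - B a||^2 + lam * ||a||_1 with ISTA."""
    n, m = len(B), len(B[0])
    # 1 / ||B||_F^2 is a safe step size because ||B||_2 <= ||B||_F.
    step = 1.0 / sum(b * b for row in B for b in row)
    a = [0.0] * m
    for _ in range(iters):
        resid = [sum(B[i][j] * a[j] for j in range(m)) - y[i] for i in range(n)]
        grad = [sum(B[i][j] * resid[i] for i in range(n)) for j in range(m)]
        a = [soft_threshold(a[j] - step * grad[j], step * lam) for j in range(m)]
    return a

def refine(points, centroids, labels, sigma=1.0, lam=0.01):
    """Stand-in refinement (an assumption, not the paper's method)."""
    k = len(centroids)
    # Hypothetical dictionary: Gaussian kernel between each point and each centroid.
    B = [[math.exp(-math.dist(p, c) ** 2 / (2 * sigma ** 2)) for c in centroids]
         for p in points]
    # One sparse code per cluster, fitted to that cluster's 0/1 indicator vector.
    codes = [ista(B, [1.0 if l == c else 0.0 for l in labels], lam) for c in range(k)]
    # Reassign each point to the cluster whose reconstruction scores highest.
    scores = [[sum(B[i][j] * codes[c][j] for j in range(k)) for c in range(k)]
              for i in range(len(points))]
    return [max(range(k), key=lambda c: row[c]) for row in scores]

# Two small, well-separated grids of 2-D points.
grid = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
points = grid + [(x + 5.0, y + 5.0) for x, y in grid]

centroids, labels = kmeans(points, k=2)              # step 1: initial clustering
refined = refine(points, centroids, labels)          # step 2: sparse-coding refinement
```

On clean, well-separated data like this toy input, the refinement is expected to leave the k-means labels unchanged; the sketch only shows how the two steps connect.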
Cite
Text
Zhang and Lu. "Large Scale Sparse Clustering." International Joint Conference on Artificial Intelligence, 2016.
Markdown
[Zhang and Lu. "Large Scale Sparse Clustering." International Joint Conference on Artificial Intelligence, 2016.](https://mlanthology.org/ijcai/2016/zhang2016ijcai-large/)
BibTeX
@inproceedings{zhang2016ijcai-large,
title = {{Large Scale Sparse Clustering}},
author = {Zhang, Ruqi and Lu, Zhiwu},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2016},
pages = {2336-2342},
url = {https://mlanthology.org/ijcai/2016/zhang2016ijcai-large/}
}