Finding Meaningful Cluster Structure Amidst Background Noise
Abstract
We consider efficient clustering algorithm under data clusterability assumptions with added noise. In contrast with most literature on this topic that considers either the adversarial noise setting or some noise generative model, we examine a realistically motivated setting in which the only restriction about the noisy part of the data is that it does not create significantly large “clusters”. Another aspect in which our model deviates from common approaches is that we stipulate the goals of clustering as discovering meaningful cluster structure in the data, rather than optimizing some objective (clustering cost). We introduce efficient algorithms that discover and cluster every subset of the data with meaningful structure and lack of structure on its complement (under some formal definition of such “structure”). Notably, the success of our algorithms does not depend on any upper bound on the fraction of noisy data. We complement our results by showing that when either the notions of structure or the noise requirements are relaxed, no such results are possible.
Cite
Text
Kushagra et al. "Finding Meaningful Cluster Structure Amidst Background Noise." International Conference on Algorithmic Learning Theory, 2016. doi:10.1007/978-3-319-46379-7_23Markdown
[Kushagra et al. "Finding Meaningful Cluster Structure Amidst Background Noise." International Conference on Algorithmic Learning Theory, 2016.](https://mlanthology.org/alt/2016/kushagra2016alt-finding/) doi:10.1007/978-3-319-46379-7_23BibTeX
@inproceedings{kushagra2016alt-finding,
title = {{Finding Meaningful Cluster Structure Amidst Background Noise}},
author = {Kushagra, Shrinu and Samadi, Samira and Ben-David, Shai},
booktitle = {International Conference on Algorithmic Learning Theory},
year = {2016},
pages = {339-354},
doi = {10.1007/978-3-319-46379-7_23},
url = {https://mlanthology.org/alt/2016/kushagra2016alt-finding/}
}