Cascade Evaluation of Clustering Algorithms

Abstract

This paper addresses the evaluation of the results of clustering algorithms and the comparison of such algorithms. We propose a new method that enriches a set of independent labeled datasets with the results of clustering, then uses a supervised method to evaluate the benefit of adding this new information to the datasets. We thus adapt the cascade generalization [1] paradigm to the case where an unsupervised learner is combined with a supervised one. We also consider the case where independent supervised learners are trained on the different groups of data objects created by the clustering [2]. We then conduct experiments using different supervised algorithms to compare various clustering algorithms. The results show that the proposed method behaves coherently, indicating, for example, that algorithms based on complex probabilistic models outperform algorithms based on simpler models.
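The first variant described above (adding the clustering result as a new attribute and measuring how much a supervised learner gains from it) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the small k-means, the one-test "stump" learner, the toy one-dimensional dataset, and all names (`kmeans`, `fit_stump`, `cascade_scores`) are assumptions made only to keep the example self-contained. The sketch also assumes that clusters are fitted on the training features without labels and that test objects are assigned to the nearest cluster center.

```python
from collections import Counter

def nearest(centers, p):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centers[c], p)))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with a deterministic init (assumption); returns k centers."""
    ordered = sorted(points)
    n = len(ordered)
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def passes(x, f, is_cat, v):
    """Equality test for categorical features, threshold test for numeric ones."""
    return x[f] == v if is_cat else x[f] <= v

def fit_stump(X, y, categorical=()):
    """Hypothetical weak learner: the single test with the best training accuracy."""
    default = Counter(y).most_common(1)[0][0]
    best_hits, best = -1, None
    for f in range(len(X[0])):
        is_cat = f in categorical
        for v in sorted({x[f] for x in X}):
            inside = Counter(l for x, l in zip(X, y) if passes(x, f, is_cat, v))
            outside = Counter(l for x, l in zip(X, y) if not passes(x, f, is_cat, v))
            hits = max(inside.values(), default=0) + max(outside.values(), default=0)
            if hits > best_hits:
                lab_in = inside.most_common(1)[0][0] if inside else default
                lab_out = outside.most_common(1)[0][0] if outside else default
                best_hits, best = hits, (f, is_cat, v, lab_in, lab_out)
    return best

def accuracy(stump, X, y):
    """Fraction of objects whose predicted label matches the true label."""
    f, is_cat, v, lab_in, lab_out = stump
    preds = [lab_in if passes(x, f, is_cat, v) else lab_out for x in X]
    return sum(p == l for p, l in zip(preds, y)) / len(y)

def cascade_scores(X_tr, y_tr, X_te, y_te, k):
    """Test accuracy without and with the cluster id as an extra attribute."""
    base = fit_stump(X_tr, y_tr)
    centers = kmeans(X_tr, k)          # the clusterer never sees the labels
    d = len(X_tr[0])                   # index of the added cluster-id attribute
    enrich = lambda X: [x + (nearest(centers, x),) for x in X]
    casc = fit_stump(enrich(X_tr), y_tr, categorical={d})
    return accuracy(base, X_te, y_te), accuracy(casc, enrich(X_te), y_te)

# Toy data (assumption): three 1-D groups whose labels are 0, 1, 0, so that
# no single threshold on the raw attribute separates the two classes.
points = [(base + i / 10,) for base in (0.0, 5.0, 10.0) for i in range(10)]
labels = [0] * 10 + [1] * 10 + [0] * 10
tr, te = range(0, 30, 2), range(1, 30, 2)
base_acc, cascade_acc = cascade_scores([points[i] for i in tr], [labels[i] for i in tr],
                                       [points[i] for i in te], [labels[i] for i in te], k=3)
print(base_acc, cascade_acc)
```

Under this reading, a clustering algorithm would be scored by the gap between the two accuracies, averaged over several labeled datasets and supervised learners; the weak learner here is deliberately limited so that the contribution of the cluster attribute is visible.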

Cite

Text

Candillier et al. "Cascade Evaluation of Clustering Algorithms." European Conference on Machine Learning, 2006. doi:10.1007/11871842_54

Markdown

[Candillier et al. "Cascade Evaluation of Clustering Algorithms." European Conference on Machine Learning, 2006.](https://mlanthology.org/ecmlpkdd/2006/candillier2006ecml-cascade/) doi:10.1007/11871842_54

BibTeX

@inproceedings{candillier2006ecml-cascade,
  title     = {{Cascade Evaluation of Clustering Algorithms}},
  author    = {Candillier, Laurent and Tellier, Isabelle and Torre, Fabien and Bousquet, Olivier},
  booktitle = {European Conference on Machine Learning},
  year      = {2006},
  pages     = {574-581},
  doi       = {10.1007/11871842_54},
  url       = {https://mlanthology.org/ecmlpkdd/2006/candillier2006ecml-cascade/}
}