Interpreting and Extending Classical Agglomerative Clustering Algorithms Using a Model-Based Approach

Abstract

We present two results which arise from a model-based approach to hierarchical agglomerative clustering. First, we show formally that the common heuristic agglomerative clustering algorithms – single-link, complete-link, group-average, and Ward’s method – are each equivalent to a hierarchical model-based method. This interpretation gives a theoretical explanation of the empirical behavior of these algorithms, as well as a principled approach to resolving practical issues, such as number of clusters or the choice of method. Second, we show how a model-based approach can be used to extend these basic agglomerative algorithms. We introduce adjusted complete-link, Mahalanobis-link, and line-link as variants of the classical agglomerative methods, and demonstrate their utility.
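
For readers unfamiliar with the four classical merge criteria named in the abstract, the following is a minimal Python sketch of naive agglomerative clustering. It is not the authors' implementation and does not show the model-based interpretation or the proposed variants. The function names (`single_link`, `complete_link`, `group_average`, `ward`, `agglomerate`) are hypothetical, the distance is assumed to be Euclidean, and group-average is taken here as the mean pairwise distance between the two clusters, which is one common definition.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, Python 3.8+


def centroid(cluster):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))


def single_link(a, b):
    # Cost of merging = distance between the closest cross-cluster pair.
    return min(dist(p, q) for p, q in product(a, b))


def complete_link(a, b):
    # Cost of merging = distance between the farthest cross-cluster pair.
    return max(dist(p, q) for p, q in product(a, b))


def group_average(a, b):
    # Cost of merging = mean distance over all cross-cluster pairs
    # (assumed definition; some texts average over the merged cluster).
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def ward(a, b):
    # Cost of merging = increase in within-cluster sum of squared error.
    scale = len(a) * len(b) / (len(a) + len(b))
    return scale * dist(centroid(a), centroid(b)) ** 2


def agglomerate(points, cost, k):
    """Greedily merge the cheapest pair of clusters until k clusters remain."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (5, 5), (5, 6)]
    for cost in (single_link, complete_link, group_average, ward):
        print(cost.__name__, agglomerate(pts, cost, k=2))
```

All four criteria share the same greedy loop and differ only in the merge cost, which is the structure the abstract's equivalence claim refers to. The sketch recomputes every pairwise cost at each step, so it is suitable only for small inputs.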

Cite

Text

Kamvar et al. "Interpreting and Extending Classical Agglomerative Clustering Algorithms Using a Model-Based Approach." International Conference on Machine Learning, 2002.

Markdown

[Kamvar et al. "Interpreting and Extending Classical Agglomerative Clustering Algorithms Using a Model-Based Approach." International Conference on Machine Learning, 2002.](https://mlanthology.org/icml/2002/kamvar2002icml-interpreting/)

BibTeX

@inproceedings{kamvar2002icml-interpreting,
  title     = {{Interpreting and Extending Classical Agglomerative Clustering Algorithms Using a Model-Based Approach}},
  author    = {Kamvar, Sepandar D. and Klein, Dan and Manning, Christopher D.},
  booktitle = {International Conference on Machine Learning},
  year      = {2002},
  pages     = {283-290},
  url       = {https://mlanthology.org/icml/2002/kamvar2002icml-interpreting/}
}