Bayesian Hierarchical Clustering
Abstract
We present a novel algorithm for agglomerative hierarchical clustering based on evaluating marginal likelihoods of a probabilistic model. This algorithm has several advantages over traditional distance-based agglomerative clustering algorithms. (1) It defines a probabilistic model of the data which can be used to compute the predictive distribution of a test point and the probability of it belonging to any of the existing clusters in the tree. (2) It uses a model-based criterion to decide on merging clusters rather than an ad-hoc distance metric. (3) Bayesian hypothesis testing is used to decide which merges are advantageous and to output the recommended depth of the tree. (4) The algorithm can be interpreted as a novel fast bottom-up approximate inference method for a Dirichlet process (i.e. countably infinite) mixture model (DPM). It provides a new lower bound on the marginal likelihood of a DPM by summing over exponentially many clusterings of the data in polynomial time. We describe procedures for learning the model hyperparameters, computing the predictive distribution, and extensions to the algorithm. Experimental results on synthetic and real-world data sets demonstrate useful properties of the algorithm.
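The model-based merge criterion described in the abstract can be sketched in a few dozen lines. The sketch below assumes binary data with a conjugate Beta-Bernoulli cluster model (one of the conjugate models the paper supports); at each step it computes, for every candidate pair, the posterior probability r_k that the two subtrees came from a single cluster, and greedily merges the pair with the highest r_k. All class and function names here are illustrative, not from the paper's code.

```python
import math
from itertools import combinations

def log_beta(a, b):
    # log of the Beta function via log-gamma
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_ml_bernoulli(data, a=1.0, b=1.0):
    """Log marginal likelihood p(D|H1) of binary vectors under one
    Beta(a, b)-Bernoulli cluster, independently per dimension."""
    n = len(data)
    total = 0.0
    for d in range(len(data[0])):
        m = sum(x[d] for x in data)  # count of ones in dimension d
        total += log_beta(a + m, b + n - m) - log_beta(a, b)
    return total

def logsumexp2(x, y):
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))

class Node:
    def __init__(self, data, alpha):
        # leaf: one data point, d_k = alpha, p(D|T) = p(D|H1)
        self.data = list(data)
        self.log_d = math.log(alpha)
        self.log_ml = log_ml_bernoulli(self.data)
        self.log_tree = self.log_ml
        self.log_r = 0.0

def merge(ni, nj, alpha):
    # build the candidate merged node k from subtrees i and j
    node = Node.__new__(Node)
    node.data = ni.data + nj.data
    n = len(node.data)
    log_num = math.log(alpha) + math.lgamma(n)          # alpha * Gamma(n_k)
    node.log_d = logsumexp2(log_num, ni.log_d + nj.log_d)
    log_pi = log_num - node.log_d                       # prior weight pi_k
    node.log_ml = log_ml_bernoulli(node.data)           # p(D_k|H1)
    # p(D_k|T_k) = pi_k p(D_k|H1) + (1 - pi_k) p(D_i|T_i) p(D_j|T_j)
    log_1m_pi = math.log(-math.expm1(log_pi)) if log_pi < 0 else float("-inf")
    node.log_tree = logsumexp2(log_pi + node.log_ml,
                               log_1m_pi + ni.log_tree + nj.log_tree)
    node.log_r = log_pi + node.log_ml - node.log_tree   # merge posterior r_k
    return node

def bhc(points, alpha=1.0):
    """Greedy bottom-up BHC: repeatedly merge the pair with highest r_k."""
    nodes = [Node([p], alpha) for p in points]
    while len(nodes) > 1:
        m, a, b = max(((merge(x, y, alpha), x, y)
                       for x, y in combinations(nodes, 2)),
                      key=lambda t: t[0].log_r)
        nodes.remove(a)
        nodes.remove(b)
        nodes.append(m)
    return nodes[0]
```

The paper's tree-depth recommendation falls out of the same quantity: merges with r_k < 0.5 (log_r < log 0.5) are ones Bayesian hypothesis testing rejects, so cutting the tree at those nodes yields the recommended clustering, and the root's `log_tree` is the paper's lower bound on the DPM marginal likelihood.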
Cite
Text
Heller and Ghahramani. "Bayesian Hierarchical Clustering." International Conference on Machine Learning, 2005. doi:10.1145/1102351.1102389
Markdown
[Heller and Ghahramani. "Bayesian Hierarchical Clustering." International Conference on Machine Learning, 2005.](https://mlanthology.org/icml/2005/heller2005icml-bayesian/) doi:10.1145/1102351.1102389
BibTeX
@inproceedings{heller2005icml-bayesian,
title = {{Bayesian Hierarchical Clustering}},
author = {Heller, Katherine A. and Ghahramani, Zoubin},
booktitle = {International Conference on Machine Learning},
year = {2005},
pages = {297-304},
doi = {10.1145/1102351.1102389},
url = {https://mlanthology.org/icml/2005/heller2005icml-bayesian/}
}