The Translation-Invariant Wishart-Dirichlet Process for Clustering Distance Data

Abstract

We present a probabilistic model for clustering objects represented via pairwise dissimilarities. We propose that even if an underlying vectorial representation exists, it is better to work directly with the dissimilarity matrix, thereby avoiding the unnecessary bias and variance caused by embeddings. Because the model uses a Dirichlet process prior, the number of clusters need not be fixed in advance. Furthermore, the clustering model is permutation-, scale-, and translation-invariant; we call it the Translation-Invariant Wishart-Dirichlet (TIWD) process. A highly efficient MCMC sampling algorithm is presented. Experiments show that the TIWD process exhibits several advantages over competing approaches.
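
Two properties named in the abstract can be illustrated with a minimal sketch. It is not the authors' TIWD sampler. It only checks that a squared Euclidean dissimilarity matrix is unchanged when all underlying vectors are shifted by the same offset, and it draws a partition from a Chinese restaurant process, a standard construction of the Dirichlet process prior over partitions, to show that the number of clusters is not fixed in advance. The function names `squared_distances` and `crp_partition`, the concentration parameter `alpha`, and the toy data are assumptions made for illustration.

```python
import random


def squared_distances(points):
    """Pairwise squared Euclidean distances between the given vectors."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
            for p in points]


def crp_partition(n, alpha, rng):
    """Draw cluster labels for n objects from a Chinese restaurant process.

    Hypothetical helper: alpha is the concentration parameter. Object i joins
    existing cluster k with weight counts[k], or opens a new cluster with
    weight alpha, so the number of clusters is random rather than preset.
    """
    labels, counts = [], []
    for _ in range(n):
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):  # a new cluster was chosen
            counts.append(0)
        counts[k] += 1
        labels.append(k)
    return labels


rng = random.Random(0)

# Toy data (assumed): six points in three dimensions.
points = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(6)]

# Translate every point by the same offset.
shift = [5.0, -2.0, 7.5]
shifted = [[x + s for x, s in zip(p, shift)] for p in points]

D = squared_distances(points)
D_shifted = squared_distances(shifted)

# Largest entrywise difference between the two dissimilarity matrices;
# it is zero up to floating-point rounding.
max_diff = max(abs(a - b)
               for row, row_s in zip(D, D_shifted)
               for a, b in zip(row, row_s))
print("max |D - D_shifted| =", max_diff)

# A partition of the six objects drawn from the prior.
labels = crp_partition(n=6, alpha=1.0, rng=rng)
print("CRP labels:", labels)
```

Because the dissimilarities carry no information about the absolute position of the underlying vectors, a model defined directly on the dissimilarity matrix has no need to commit to one particular embedding.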

Cite

Text

Vogt et al. "The Translation-Invariant Wishart-Dirichlet Process for Clustering Distance Data." International Conference on Machine Learning, 2010.

Markdown

[Vogt et al. "The Translation-Invariant Wishart-Dirichlet Process for Clustering Distance Data." International Conference on Machine Learning, 2010.](https://mlanthology.org/icml/2010/vogt2010icml-translation/)

BibTeX

@inproceedings{vogt2010icml-translation,
  title     = {{The Translation-Invariant Wishart-Dirichlet Process for Clustering Distance Data}},
  author    = {Vogt, Julia E. and Prabhakaran, Sandhya and Fuchs, Thomas J. and Roth, Volker},
  booktitle = {International Conference on Machine Learning},
  year      = {2010},
  pages     = {1111--1118},
  url       = {https://mlanthology.org/icml/2010/vogt2010icml-translation/}
}