An Alternative Prior Process for Nonparametric Bayesian Clustering

Abstract

Prior distributions play a crucial role in Bayesian approaches to clustering. Two commonly used prior distributions are the Dirichlet and Pitman-Yor processes. In this paper, we investigate the predictive probabilities that underlie these processes, and the implicit “rich-get-richer” characteristic of the resulting partitions. We explore an alternative prior for nonparametric Bayesian clustering, the uniform process, for applications where the “rich-get-richer” property is undesirable. We also explore the cost of this new process: partitions are no longer exchangeable with respect to the ordering of variables. We present new asymptotic and simulation-based results for the clustering characteristics of the uniform process and compare these with known results for the Dirichlet and Pitman-Yor processes. Finally, we compare performance on a real document clustering task, demonstrating the practical advantage of the uniform process despite its lack of exchangeability over orderings.
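To make the "rich-get-richer" contrast concrete, the following is a minimal Python sketch of the sequential predictive probabilities for the three priors. It is an illustration under stated assumptions, not the authors' code. The Dirichlet and Pitman-Yor rules are the standard Chinese-restaurant-style forms. The uniform-process rule used here (each existing cluster chosen with probability 1/(K + θ), a new cluster with probability θ/(K + θ), where K is the current number of clusters) is our reading of the usual definition and is not spelled out in the abstract. The names `predictive_probs`, `sample_partition`, `theta`, and `discount` are hypothetical.

```python
import random


def predictive_probs(counts, theta, process="dp", discount=0.0):
    """Return (probabilities of joining each existing cluster, probability of a new cluster).

    counts   -- list of current cluster sizes
    theta    -- concentration parameter, assumed > 0
    process  -- "dp" (Dirichlet), "py" (Pitman-Yor), or "uniform"
    discount -- Pitman-Yor discount in [0, 1); ignored by the other processes
    """
    n = sum(counts)   # items seated so far
    k = len(counts)   # clusters created so far
    if process == "dp":
        # Existing clusters are weighted by size: rich get richer.
        denom = n + theta
        existing = [c / denom for c in counts]
        new = theta / denom
    elif process == "py":
        # Size-weighted as well, with a discount shifted toward new clusters.
        denom = n + theta
        existing = [(c - discount) / denom for c in counts]
        new = (theta + k * discount) / denom
    elif process == "uniform":
        # Assumed form: every existing cluster is equally likely, regardless of size.
        denom = k + theta
        existing = [1.0 / denom for _ in counts]
        new = theta / denom
    else:
        raise ValueError(f"unknown process: {process}")
    return existing, new


def sample_partition(n_items, theta, process="dp", discount=0.0, seed=0):
    """Seat n_items one at a time and return the resulting cluster sizes."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_items):
        existing, new = predictive_probs(counts, theta, process, discount)
        weights = existing + [new]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[choice] += 1   # join an existing cluster
    return counts


if __name__ == "__main__":
    for name in ("dp", "py", "uniform"):
        sizes = sample_partition(1000, theta=1.0, process=name, discount=0.5)
        print(name, "clusters:", len(sizes), "largest:", max(sizes))
```

With cluster sizes of 3 and 1 and θ = 1, the Dirichlet rule assigns the next item to the larger cluster three times as often as to the smaller one, whereas the uniform rule treats the two clusters identically. Because the uniform rule depends only on the number of clusters, the probability of a partition can change with the order in which items arrive, which is the loss of exchangeability the abstract mentions.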

Cite

Text

Wallach et al. "An Alternative Prior Process for Nonparametric Bayesian Clustering." Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.

Markdown

[Wallach et al. "An Alternative Prior Process for Nonparametric Bayesian Clustering." Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010.](https://mlanthology.org/aistats/2010/wallach2010aistats-alternative/)

BibTeX

@inproceedings{wallach2010aistats-alternative,
  title     = {{An Alternative Prior Process for Nonparametric Bayesian Clustering}},
  author    = {Wallach, Hanna and Jensen, Shane and Dicker, Lee and Heller, Katherine},
  booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  year      = {2010},
  pages     = {892--899},
  volume    = {9},
  url       = {https://mlanthology.org/aistats/2010/wallach2010aistats-alternative/}
}