Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data

Abstract

We describe a nonparametric topic model for labeled data. The model uses a mixture of random measures (MRM) as the base distribution of a Dirichlet process (DP) within the hierarchical Dirichlet process (HDP) framework, so we call it the DP-MRM. To model labeled data, we define a DP-distributed random measure for each label, and the resulting model generates an unbounded number of topics for each label. We apply DP-MRM to single-labeled and multi-labeled document corpora and compare its label prediction performance with MedLDA, LDA-SVM, and Labeled-LDA. We further extend the model by incorporating the distance dependent Chinese restaurant process (ddCRP) and use it to model multi-labeled images for image segmentation and object labeling, comparing its performance with nCuts and rddCRP.
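The construction in the abstract can be illustrated with a small generative sketch. It is not the authors' implementation, and the abstract gives no priors, hyperparameters, or inference details. The sketch assumes a truncated stick-breaking approximation of each DP, a symmetric Dirichlet base measure over words, and a Dirichlet prior on a document's label proportions. All function and parameter names (`stick_breaking`, `sample_label_measure`, `generate_document`, `gamma`, `beta`, `truncation`) are hypothetical.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights for a DP; the weights sum to 1."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, concentration)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # last atom takes the leftover stick
    return weights


def symmetric_dirichlet(dim, conc, rng):
    """Draw from a symmetric Dirichlet by normalizing gamma variates."""
    draws = [rng.gammavariate(conc, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_label_measure(gamma, vocab_size, truncation, rng):
    """One DP-distributed measure per label: weights over that label's own topics."""
    weights = stick_breaking(gamma, truncation, rng)
    # Assumption: topics are word distributions drawn from a symmetric Dirichlet.
    topics = [symmetric_dirichlet(vocab_size, 0.5, rng) for _ in range(truncation)]
    return weights, topics


def generate_document(doc_labels, label_measures, beta, n_words, truncation, rng):
    """Draw a document whose DP base is a mixture of its labels' measures."""
    # Assumption: label proportions follow a flat Dirichlet over the observed labels.
    lam = symmetric_dirichlet(len(doc_labels), 1.0, rng)
    doc_weights = stick_breaking(beta, truncation, rng)
    # Each document-level atom is drawn from the mixed base measure:
    # pick a label by lam, then one of that label's topics by its weights.
    atoms = []
    for _ in range(truncation):
        label = rng.choices(doc_labels, weights=lam)[0]
        label_weights, _ = label_measures[label]
        k = rng.choices(range(len(label_weights)), weights=label_weights)[0]
        atoms.append((label, k))
    words = []
    for _ in range(n_words):
        label, k = rng.choices(atoms, weights=doc_weights)[0]
        topic = label_measures[label][1][k]
        word = rng.choices(range(len(topic)), weights=topic)[0]
        words.append((word, label, k))
    return words


if __name__ == "__main__":
    rng = random.Random(0)
    labels = ["sports", "politics", "tech"]  # hypothetical label set
    measures = {lab: sample_label_measure(1.0, 20, 5, rng) for lab in labels}
    doc = generate_document(["sports", "tech"], measures, 1.0, 30, 5, rng)
    print(doc[:5])
```

Because every document-level atom comes from one of the document's own labels, each word is tied to a label as well as a topic. That is the property the abstract relies on for label prediction. The fixed truncation level is only a convenience for this sketch; the model itself places no bound on the number of topics per label.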

Cite

Text

Kim et al. "Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data." International Conference on Machine Learning, 2012.

Markdown

[Kim et al. "Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data." International Conference on Machine Learning, 2012.](https://mlanthology.org/icml/2012/kim2012icml-dirichlet/)

BibTeX

@inproceedings{kim2012icml-dirichlet,
  title     = {{Dirichlet Process with Mixed Random Measures: A Nonparametric Topic Model for Labeled Data}},
  author    = {Kim, Dongwoo and Kim, Suin and Oh, Alice},
  booktitle = {International Conference on Machine Learning},
  year      = {2012},
  url       = {https://mlanthology.org/icml/2012/kim2012icml-dirichlet/}
}