Evaluation Methods for Topic Models

Wallach, Hanna M.; Murray, Iain; Salakhutdinov, Ruslan; Mimno, David M.

doi:10.1145/1553374.1553515

ICML 2009 pp. 1105-1112


Abstract

A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable, several estimators for this probability have been used in the topic modeling literature, including the harmonic mean method and empirical likelihood method. In this paper, we demonstrate experimentally that commonly-used methods are unlikely to accurately estimate the probability of held-out documents, and propose two alternative methods that are both accurate and efficient.
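For orientation, here is a minimal sketch of the harmonic mean method named in the abstract, which is one of the commonly used estimators the paper evaluates, not one of its two proposed alternatives. It assumes you already have the log-likelihoods log P(w | z^(s)) of a held-out document under S topic assignments z^(s) sampled from the posterior (for example by Gibbs sampling). The function name `log_harmonic_mean` and the input format are illustrative assumptions and do not come from the paper.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Harmonic mean estimate of log P(w), computed in log space.

    log_likelihoods: list of log P(w | z^(s)) values, one per posterior
    sample z^(s). (Hypothetical input format, assumed for this sketch.)

    The estimator is P(w) ~= 1 / ((1/S) * sum_s 1 / P(w | z^(s))),
    so log P(w) ~= log S - logsumexp(-log_likelihoods).
    """
    if not log_likelihoods:
        raise ValueError("need at least one sample")
    negated = [-ll for ll in log_likelihoods]
    # Subtract the maximum before exponentiating to avoid overflow.
    shift = max(negated)
    log_sum = shift + math.log(sum(math.exp(x - shift) for x in negated))
    return math.log(len(negated)) - log_sum
```

The estimate is dominated by the samples with the lowest likelihood, which is one intuition for why it can be unreliable in practice.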


Cite

Text

Wallach et al. "Evaluation Methods for Topic Models." International Conference on Machine Learning, 2009. doi:10.1145/1553374.1553515

Markdown

[Wallach et al. "Evaluation Methods for Topic Models." International Conference on Machine Learning, 2009.](https://mlanthology.org/icml/2009/wallach2009icml-evaluation/) doi:10.1145/1553374.1553515

BibTeX

@inproceedings{wallach2009icml-evaluation,
  title     = {{Evaluation Methods for Topic Models}},
  author    = {Wallach, Hanna M. and Murray, Iain and Salakhutdinov, Ruslan and Mimno, David M.},
  booktitle = {International Conference on Machine Learning},
  year      = {2009},
  pages     = {1105--1112},
  doi       = {10.1145/1553374.1553515},
  url       = {https://mlanthology.org/icml/2009/wallach2009icml-evaluation/}
}