Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment

Jason Chuang, Sonal Gupta, Christopher Manning, Jeffrey Heer

ICML 2013 pp. 612-620


Abstract

The use of topic models to analyze domain-specific texts often requires manual validation of the latent topics to ensure they are meaningful. We introduce a framework to support large-scale assessment of topical relevance. We measure the correspondence between a set of latent topics and a set of reference concepts to quantify four types of topical misalignment: junk, fused, missing, and repeated topics. Our analysis compares 10,000 topic model variants to 200 expert-provided domain concepts, and demonstrates how our framework can inform choices of model parameters, inference algorithms, and intrinsic measures of topical quality.
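The four misalignment types named in the abstract can be illustrated with a small sketch. It assumes a binary topic-to-concept match matrix is already available; the abstract does not say how correspondence is computed, so the function name `classify_misalignment`, the matrix layout, and the counting rules below are hypothetical readings rather than the authors' procedure. Under this reading, a junk topic matches no concept, a fused topic matches several concepts, a missing concept is matched by no topic, and a repeated concept is matched by several topics.

```python
from typing import Dict, List


def classify_misalignment(match: List[List[bool]]) -> Dict[str, List[int]]:
    """Classify misalignments from a binary match matrix (assumed input).

    match[t][c] is True when latent topic t is judged to correspond to
    reference concept c. How that judgment is made is outside this sketch.
    """
    n_topics = len(match)
    n_concepts = len(match[0]) if match else 0

    # Number of concepts each topic matches, and topics each concept matches.
    topic_hits = [sum(row) for row in match]
    concept_hits = [
        sum(match[t][c] for t in range(n_topics)) for c in range(n_concepts)
    ]

    return {
        "junk": [t for t, k in enumerate(topic_hits) if k == 0],
        "fused": [t for t, k in enumerate(topic_hits) if k > 1],
        "missing": [c for c, k in enumerate(concept_hits) if k == 0],
        "repeated": [c for c, k in enumerate(concept_hits) if k > 1],
    }


# Toy data, invented for illustration: 4 topics (rows) x 4 concepts (columns).
example = [
    [True, False, False, False],   # topic 0 matches concept 0
    [True, False, False, False],   # topic 1 also matches concept 0
    [False, True, True, False],    # topic 2 matches concepts 1 and 2
    [False, False, False, False],  # topic 3 matches nothing
]
result = classify_misalignment(example)
```

In this toy matrix, topic 3 is junk, topic 2 is fused, concept 3 is missing, and concept 0 is repeated.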


Cite

Text

Chuang et al. "Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment." International Conference on Machine Learning, 2013.

Markdown

[Chuang et al. "Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment." International Conference on Machine Learning, 2013.](https://mlanthology.org/icml/2013/chuang2013icml-topic/)

BibTeX

@inproceedings{chuang2013icml-topic,
  title     = {{Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment}},
  author    = {Chuang, Jason and Gupta, Sonal and Manning, Christopher and Heer, Jeffrey},
  booktitle = {International Conference on Machine Learning},
  year      = {2013},
  pages     = {612--620},
  volume    = {28},
  url       = {https://mlanthology.org/icml/2013/chuang2013icml-topic/}
}