On Estimation and Selection for Topic Models

Abstract

This article describes posterior maximization for topic models, identifying computational and conceptual gains from inference under a non-standard parametrization. We then show that fitted parameters can be used as the basis for a novel approach to marginal likelihood estimation, via block-diagonal approximation to the information matrix, that facilitates choosing the number of latent topics. This likelihood-based model selection is complemented with a goodness-of-fit analysis built around estimated residual dispersion. Examples are provided to illustrate model selection as well as to compare our estimation against standard alternative techniques.
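As a rough illustration of the selection criterion mentioned in the abstract (a generic Laplace sketch only; the paper's actual derivation, block structure, and parametrization are its own and are not reproduced here), the marginal likelihood of a K-topic model can be approximated around the MAP fit \(\hat{\Theta}\), with the determinant of the information matrix factored over diagonal blocks:

\[
p(\mathbf{X} \mid K) \;\approx\; p(\mathbf{X} \mid \hat{\Theta}, K)\, p(\hat{\Theta} \mid K)\, (2\pi)^{d/2}\, |H(\hat{\Theta})|^{-1/2},
\qquad
|H(\hat{\Theta})| \;\approx\; \prod_{b} |H_b(\hat{\Theta})|,
\]

where \(d\) is the parameter dimension, \(H\) is the negative Hessian of the log posterior (the observed information) evaluated at \(\hat{\Theta}\), and \(H_b\) are its diagonal blocks. Under this sketch, the number of topics is chosen to maximize the approximate marginal likelihood, with the block-diagonal factorization avoiding the cost of a full determinant.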

Cite

Text

Taddy. "On Estimation and Selection for Topic Models." Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012.

Markdown

[Taddy. "On Estimation and Selection for Topic Models." Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012.](https://mlanthology.org/aistats/2012/taddy2012aistats-estimation/)

BibTeX

@inproceedings{taddy2012aistats-estimation,
  title     = {{On Estimation and Selection for Topic Models}},
  author    = {Taddy, Matt},
  booktitle = {Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics},
  year      = {2012},
  pages     = {1184--1193},
  volume    = {22},
  url       = {https://mlanthology.org/aistats/2012/taddy2012aistats-estimation/}
}