Latent Dirichlet Allocation
Abstract
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.
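The generative process the abstract describes can be sketched in code: draw per-document topic proportions from a Dirichlet prior, then for each word draw a topic and then a word from that topic's distribution. This is a minimal illustrative sketch, not the paper's inference procedure; the dimensions, prior values, and the `generate_document` helper are all hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical corpus dimensions (illustrative only, not from the paper).
num_topics = 3    # number of topics k
vocab_size = 8    # vocabulary size V
doc_length = 10   # words per document N

# Dirichlet prior on per-document topic proportions (hypothetical alpha).
alpha = np.full(num_topics, 0.5)
# Per-topic word distributions (here random; learned from data in practice).
beta = rng.dirichlet(np.ones(vocab_size), size=num_topics)

def generate_document():
    # theta ~ Dirichlet(alpha): this document's mixture of topics.
    theta = rng.dirichlet(alpha)
    words = []
    for _ in range(doc_length):
        z = rng.choice(num_topics, p=theta)    # sample a topic assignment
        w = rng.choice(vocab_size, p=beta[z])  # sample a word from that topic
        words.append(w)
    return words

doc = generate_document()
```

Each document gets its own `theta`, which is what lets a single document mix several topics, in contrast to a mixture-of-unigrams model where one topic is drawn per document.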
Cite
Text
Blei et al. "Latent Dirichlet Allocation." Neural Information Processing Systems, 2001.
Markdown
[Blei et al. "Latent Dirichlet Allocation." Neural Information Processing Systems, 2001.](https://mlanthology.org/neurips/2001/blei2001neurips-latent/)
BibTeX
@inproceedings{blei2001neurips-latent,
title = {{Latent Dirichlet Allocation}},
author = {Blei, David M. and Ng, Andrew Y. and Jordan, Michael I.},
booktitle = {Neural Information Processing Systems},
year = {2001},
pages = {601-608},
url = {https://mlanthology.org/neurips/2001/blei2001neurips-latent/}
}