Online Learning for Latent Dirichlet Allocation

Abstract

We develop an online variational Bayes (VB) algorithm for Latent Dirichlet Allocation (LDA). Online LDA is based on online stochastic optimization with a natural gradient step, which we show converges to a local optimum of the VB objective function. It can handily analyze massive document collections, including those arriving in a stream. We study the performance of online LDA in several ways, including by fitting a 100-topic model to 3.3M articles from Wikipedia in a single pass. We demonstrate that online LDA finds topic models as good as or better than those found with batch VB, and in a fraction of the time.
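As a rough illustration of "online stochastic optimization with a natural gradient step", the sketch below assumes a decaying step size of the form rho_t = (tau0 + t)^(-kappa) and treats the step as a weighted average of the current topic parameters and an estimate computed from one minibatch. The function names, default values, and toy numbers are hypothetical, and the per-document variational step that would produce the minibatch estimate is left out; this is a minimal sketch under those assumptions, not the authors' implementation.

```python
def step_size(t, tau0=1.0, kappa=0.7):
    """Assumed schedule rho_t = (tau0 + t) ** -kappa.

    kappa is restricted to (0.5, 1], the usual condition under which
    stochastic approximation steps shrink fast enough to converge.
    """
    if not 0.5 < kappa <= 1.0:
        raise ValueError("kappa must lie in (0.5, 1]")
    return (tau0 + t) ** (-kappa)


def online_update(lam, lam_hat, rho):
    """Blend current topic parameters with a minibatch estimate.

    lam and lam_hat are topics-by-vocabulary lists of lists;
    the result is (1 - rho) * lam + rho * lam_hat, elementwise.
    """
    return [
        [(1.0 - rho) * old + rho * new for old, new in zip(row, row_hat)]
        for row, row_hat in zip(lam, lam_hat)
    ]


# Toy usage: two topics, three vocabulary words, three minibatches.
# The minibatch estimates are made-up numbers for illustration only.
lam = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
minibatch_estimates = [
    [[4.0, 1.0, 1.0], [1.0, 1.0, 4.0]],
    [[5.0, 2.0, 1.0], [1.0, 2.0, 5.0]],
    [[4.0, 1.0, 2.0], [2.0, 1.0, 4.0]],
]
for t, lam_hat in enumerate(minibatch_estimates):
    lam = online_update(lam, lam_hat, step_size(t))
```

Because the step size shrinks with t, later minibatches move the parameters less than earlier ones, which is what lets a single pass over a stream settle down rather than chase each new batch.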

Cite

Text

Hoffman et al. "Online Learning for Latent Dirichlet Allocation." Neural Information Processing Systems, 2010.

Markdown

[Hoffman et al. "Online Learning for Latent Dirichlet Allocation." Neural Information Processing Systems, 2010.](https://mlanthology.org/neurips/2010/hoffman2010neurips-online/)

BibTeX

@inproceedings{hoffman2010neurips-online,
  title     = {{Online Learning for Latent Dirichlet Allocation}},
  author    = {Hoffman, Matthew and Bach, Francis R. and Blei, David M.},
  booktitle = {Neural Information Processing Systems},
  year      = {2010},
  pages     = {856--864},
  url       = {https://mlanthology.org/neurips/2010/hoffman2010neurips-online/}
}