Scalable Moment-Based Inference for Latent Dirichlet Allocation

Abstract

Topic models such as Latent Dirichlet Allocation are widely used methods for text analysis. Recently, moment-based inference with provable performance guarantees has been proposed for topic models. Compared with inference algorithms that approximate the maximum likelihood objective, moment-based inference offers theoretical guarantees on recovering the model parameters. One such inference method is tensor orthogonal decomposition, which requires only mild assumptions for exact recovery of topics. However, it suffers from a scalability issue due to the creation of dense, high-dimensional tensors. In this work, we propose a speedup technique that leverages the special structure of these tensors. It is efficient in both time and space, and requires scanning the corpus only twice. It improves over the state-of-the-art inference algorithm by one to three orders of magnitude while matching its inference quality.
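The tensor orthogonal decomposition the abstract refers to recovers the rank-1 components of a symmetric third-order moment tensor via power iteration with deflation. The sketch below is a minimal illustration of that core step on a small synthetic tensor; the function names and the simple deflation scheme are illustrative assumptions, not the paper's actual implementation (which avoids ever materializing the dense tensor).

```python
import numpy as np

def tensor_apply(T, u):
    # Contract the tensor along two modes: returns the vector T(I, u, u).
    return np.einsum('ijk,j,k->i', T, u, u)

def power_iteration(T, n_iter=100, seed=0):
    # Tensor power iteration: u <- T(I, u, u) / ||T(I, u, u)||.
    rng = np.random.default_rng(seed)
    u = rng.normal(size=T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        v = tensor_apply(T, u)
        u = v / np.linalg.norm(v)
    lam = tensor_apply(T, u) @ u  # eigenvalue = T(u, u, u)
    return lam, u

def orthogonal_decomposition(T, k):
    # Recover k (eigenvalue, eigenvector) pairs, deflating the tensor
    # by subtracting each recovered rank-1 component.
    T = T.copy()
    lams, vecs = [], []
    for _ in range(k):
        lam, v = power_iteration(T)
        lams.append(lam)
        vecs.append(v)
        T -= lam * np.einsum('i,j,k->ijk', v, v, v)
    return np.array(lams), np.array(vecs)
```

On an orthogonally decomposable tensor T = sum_i lambda_i v_i^(x3) with orthonormal v_i and positive lambda_i, each power-iteration run converges to one of the v_i, and deflation exposes the remaining components; this is the recovery step whose naive cost the paper's technique accelerates.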

Cite

Text

Wang et al. "Scalable Moment-Based Inference for Latent Dirichlet Allocation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014. doi:10.1007/978-3-662-44845-8_19

Markdown

[Wang et al. "Scalable Moment-Based Inference for Latent Dirichlet Allocation." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2014.](https://mlanthology.org/ecmlpkdd/2014/wang2014ecmlpkdd-scalable/) doi:10.1007/978-3-662-44845-8_19

BibTeX

@inproceedings{wang2014ecmlpkdd-scalable,
  title     = {{Scalable Moment-Based Inference for Latent Dirichlet Allocation}},
  author    = {Wang, Chi and Liu, Xueqing and Song, Yanglei and Han, Jiawei},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2014},
  pages     = {290--305},
  doi       = {10.1007/978-3-662-44845-8_19},
  url       = {https://mlanthology.org/ecmlpkdd/2014/wang2014ecmlpkdd-scalable/}
}