Dynamic Rank Factor Model for Text Streams

Abstract

We propose a semi-parametric and dynamic rank factor model for topic modeling, capable of (1) discovering topic prevalence over time, and (2) learning contemporaneous multi-scale dependence structures, providing topic and word correlations as a byproduct. The high-dimensional and time-evolving ordinal/rank observations (such as word counts), after an arbitrary monotone transformation, are well accommodated through an underlying dynamic sparse factor model. The framework naturally admits heavy-tailed innovations, capable of inferring abrupt temporal jumps in the importance of topics. Posterior inference is performed through straightforward Gibbs sampling, based on the forward-filtering backward-sampling algorithm. Moreover, an efficient data subsampling scheme is leveraged to speed up inference on massive datasets. The modeling framework is illustrated on two real datasets: the US State of the Union Address and the JSTOR collection from Science.
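The abstract's posterior inference rests on forward-filtering backward-sampling (FFBS) within a Gibbs sampler. As a minimal illustration of that building block (not the authors' full model), the sketch below draws one sample of a latent path in a toy one-dimensional linear-Gaussian state-space model; all parameters (`phi`, `q`, `r`) and the AR(1) dynamics are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-Gaussian state-space model (all parameters assumed):
#   x_t = phi * x_{t-1} + eps_t,  eps_t ~ N(0, q)   (latent state)
#   y_t = x_t + nu_t,             nu_t ~ N(0, r)    (observation)
phi, q, r = 0.9, 0.5, 1.0
T = 200

# Simulate synthetic data from the model.
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = phi * x_true[t - 1] + rng.normal(0.0, np.sqrt(q))
y = x_true + rng.normal(0.0, np.sqrt(r), size=T)

def ffbs(y, phi, q, r):
    """Draw one sample of x_{1:T} | y_{1:T} via forward-filtering backward-sampling."""
    T = len(y)
    m = np.zeros(T)  # filtered means  E[x_t | y_{1:t}]
    c = np.zeros(T)  # filtered variances
    # Forward pass: Kalman filter.
    m_pred, c_pred = 0.0, q / (1.0 - phi**2)  # stationary prior on x_1
    for t in range(T):
        if t > 0:
            m_pred = phi * m[t - 1]
            c_pred = phi**2 * c[t - 1] + q
        k = c_pred / (c_pred + r)  # Kalman gain
        m[t] = m_pred + k * (y[t] - m_pred)
        c[t] = (1.0 - k) * c_pred
    # Backward pass: sample x_T, then x_t | x_{t+1}, y_{1:t} for t = T-1, ..., 1.
    x = np.zeros(T)
    x[T - 1] = rng.normal(m[T - 1], np.sqrt(c[T - 1]))
    for t in range(T - 2, -1, -1):
        g = phi * c[t] / (phi**2 * c[t] + q)
        mean = m[t] + g * (x[t + 1] - phi * m[t])
        var = c[t] - g * phi * c[t]
        x[t] = rng.normal(mean, np.sqrt(var))
    return x

draw = ffbs(y, phi, q, r)
```

In the paper's setting, a step like this would be one conditional update inside a larger Gibbs sweep over factor loadings, latent factors, and the rank-likelihood latent variables; the same two-pass structure applies per latent time series.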

Cite

Text

Han et al. "Dynamic Rank Factor Model for Text Streams." Neural Information Processing Systems, 2014.

Markdown

[Han et al. "Dynamic Rank Factor Model for Text Streams." Neural Information Processing Systems, 2014.](https://mlanthology.org/neurips/2014/han2014neurips-dynamic/)

BibTeX

@inproceedings{han2014neurips-dynamic,
  title     = {{Dynamic Rank Factor Model for Text Streams}},
  author    = {Han, Shaobo and Du, Lin and Salazar, Esther and Carin, Lawrence},
  booktitle = {Neural Information Processing Systems},
  year      = {2014},
  pages     = {2663--2671},
  url       = {https://mlanthology.org/neurips/2014/han2014neurips-dynamic/}
}