Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process

Abstract

Deep topic models have shown an impressive ability to extract multi-layer latent document representations and to discover hierarchical, semantically meaningful topics. However, most deep topic models are limited to a single-step generative process, even though progressive generative processes have achieved impressive performance in modeling image data. To this end, we propose a novel progressive deep topic model that consists of a knowledge-informed textual data coarsening process and a corresponding progressive generative model. The former builds multi-level observations ranging from concrete to abstract, while the latter gradually generates more concrete observations. Additionally, we incorporate a graph-enhanced decoder to capture the semantic relationships among words at different levels of observation. Furthermore, we perform a theoretical analysis of the proposed model based on information-theoretic principles and show how it can alleviate the well-known "latent variable collapse" problem. Finally, extensive experiments demonstrate that the proposed model effectively improves the ability of deep topic models, resulting in higher-quality latent document representations and topics.

Cite

Text

Duan et al. "Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process." International Conference on Machine Learning, 2023.

Markdown

[Duan et al. "Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/duan2023icml-bayesian/)

BibTeX

@inproceedings{duan2023icml-bayesian,
  title     = {{Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process}},
  author    = {Duan, Zhibin and Liu, Xinyang and Su, Yudi and Xu, Yishi and Chen, Bo and Zhou, Mingyuan},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {8731--8746},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/duan2023icml-bayesian/}
}