Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process
Abstract
Deep topic models have shown an impressive ability to extract multi-layer document latent representations and discover hierarchical, semantically meaningful topics. However, most deep topic models are limited to a single-step generative process, despite the fact that progressive generative processes have achieved impressive performance in modeling image data. To this end, in this paper, we propose a novel progressive deep topic model that consists of a knowledge-informed textual data coarsening process and a corresponding progressive generative model. The former is used to build multi-level observations ranging from concrete to abstract, while the latter is used to generate more concrete observations gradually. Additionally, we incorporate a graph-enhanced decoder to capture the semantic relationships among words at different levels of observation. Furthermore, we perform a theoretical analysis of the proposed model based on the principles of information theory and show how it can alleviate the well-known "latent variable collapse" problem. Finally, extensive experiments demonstrate that our proposed model effectively improves the ability of deep topic models, resulting in higher-quality latent document representations and topics.
Cite
Text
Duan et al. "Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process." International Conference on Machine Learning, 2023.
Markdown
[Duan et al. "Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/duan2023icml-bayesian/)
BibTeX
@inproceedings{duan2023icml-bayesian,
title = {{Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process}},
author = {Duan, Zhibin and Liu, Xinyang and Su, Yudi and Xu, Yishi and Chen, Bo and Zhou, Mingyuan},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {8731-8746},
volume = {202},
url = {https://mlanthology.org/icml/2023/duan2023icml-bayesian/}
}