Encouraging Sparsity in Neural Topic Modeling with Non-Mean-Field Inference

Chen, Jiayao; Wang, Rui; He, Jueying; Li, Mark Junjie

doi:10.1007/978-3-031-43421-1_9

ECML-PKDD 2023 pp. 142-158

Abstract

Topic modeling is a popular method for discovering semantic information from textual data, with latent Dirichlet allocation (LDA) being a representative model. Recently, researchers have explored the use of variational autoencoders (VAE) to improve the performance of LDA. However, two major limitations remain: (1) the Dirichlet prior is inadequate for extracting precise semantic information in VAE-LDA models, as it introduces a trade-off between topic quality and the sparsity of representations; (2) new variants of VAE-LDA models with auxiliary variables generally ignore the correlation between latent variables in the inference process because of the mean-field assumption. To address these issues, we propose a Sparsity Reinforced and Non-Mean-Field Topic Model (SpareNTM), which adds a bank of auxiliary Bernoulli variables to the generative process of LDA to further model the sparsity of document representations. Each document is thus forced to focus on a subset of topics by a corresponding Bernoulli topic selector. Then, instead of applying the mean-field assumption for the posterior approximation, we take full advantage of VAE to realize a non-mean-field approximation, which preserves the connection between latent variables. Experimental results on three datasets (20NewsGroup, Wikitext-103, and SearchSnippets) show that our model outperforms recent topic models in terms of both topic quality and sparsity.
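
The idea of a Bernoulli topic selector can be illustrated with a small generative sketch. The code below is not the SpareNTM implementation: the function name `sample_sparse_topic_proportions`, the parameters `alpha` and `pi`, and the fallback that keeps at least one topic active are all assumptions made for illustration. It draws one Bernoulli selector per topic and then normalizes Gamma draws over the selected topics only, which is equivalent to a Dirichlet draw restricted to that subset.

```python
import numpy as np


def sample_sparse_topic_proportions(alpha, pi, rng):
    """Draw topic proportions restricted to a Bernoulli-selected topic subset.

    alpha: Dirichlet concentration per topic (hypothetical parameter name).
    pi:    probability that each topic is switched on (hypothetical name).
    """
    alpha = np.asarray(alpha, dtype=float)
    pi = np.asarray(pi, dtype=float)
    num_topics = alpha.shape[0]

    # One Bernoulli selector per topic: True keeps the topic, False drops it.
    selector = rng.random(num_topics) < pi
    if not selector.any():
        # Assumption of this sketch: force one topic on so that the
        # proportions can still be normalized.
        selector[rng.integers(num_topics)] = True

    # Normalized Gamma draws over the selected topics give a Dirichlet
    # sample on that subset; unselected topics get exactly zero weight.
    masked = rng.gamma(alpha) * selector
    theta = masked / masked.sum()
    return selector.astype(int), theta


rng = np.random.default_rng(0)
selector, theta = sample_sparse_topic_proportions(
    alpha=np.full(10, 0.5), pi=np.full(10, 0.3), rng=rng
)
print(selector)
print(np.round(theta, 3))
```

Topics whose selector is 0 receive zero proportion, so sparsity is controlled by `pi` rather than by shrinking the Dirichlet concentration, which is the trade-off the abstract attributes to relying on the Dirichlet prior alone.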

Cite

Text

Chen et al. "Encouraging Sparsity in Neural Topic Modeling with Non-Mean-Field Inference." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43421-1_9

Markdown

[Chen et al. "Encouraging Sparsity in Neural Topic Modeling with Non-Mean-Field Inference." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/chen2023ecmlpkdd-encouraging/) doi:10.1007/978-3-031-43421-1_9

BibTeX

@inproceedings{chen2023ecmlpkdd-encouraging,
  title     = {{Encouraging Sparsity in Neural Topic Modeling with Non-Mean-Field Inference}},
  author    = {Chen, Jiayao and Wang, Rui and He, Jueying and Li, Mark Junjie},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {142-158},
  doi       = {10.1007/978-3-031-43421-1_9},
  url       = {https://mlanthology.org/ecmlpkdd/2023/chen2023ecmlpkdd-encouraging/}
}