Weakly Semi-Supervised Neural Topic Models

Abstract

We consider the problem of topic modeling in a weakly semi-supervised setting. In this scenario, we assume that the user knows a priori a subset of the topics she wants the model to learn and is able to provide a few exemplar documents for those topics. In addition, while each document typically consists of multiple topics, we do not assume that the user will exhaustively identify all of its topics. Recent state-of-the-art topic models such as NVDM, referred to herein as Neural Topic Models (NTMs), fall under the variational autoencoder (VAE) framework. We extend NTMs to the weakly semi-supervised setting by using informative priors in the training objective. After analyzing the effect of informative priors, we propose a simple modification of NVDM with a logit-normal posterior that we show achieves better alignment with user-desired topics than other NTMs.
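
To make the two ingredients of the abstract concrete, here is a minimal sketch, not the authors' implementation, of how a logit-normal posterior and a per-document informative Gaussian prior could enter an NVDM-style VAE objective. All names (LogitNormalNTM, elbo_loss, prior_mu, the boost value 3.0) are hypothetical; the informative prior is realized here, under that assumption, by shifting the prior mean toward the user-identified topics for exemplar documents while unlabeled documents keep the standard N(0, I) prior.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LogitNormalNTM(nn.Module):
    def __init__(self, vocab_size, num_topics, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, num_topics)
        self.logvar = nn.Linear(hidden, num_topics)
        self.decoder = nn.Linear(num_topics, vocab_size)  # topic-to-word logits

    def forward(self, bow):
        # Encode the bag-of-words vector into a Gaussian over topic logits.
        h = self.encoder(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        theta = F.softmax(z, dim=-1)  # logit-normal topic proportions
        recon = F.log_softmax(self.decoder(theta), dim=-1)
        return recon, mu, logvar

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    # KL(q || p) for diagonal Gaussians; the informative prior enters here.
    return 0.5 * torch.sum(
        logvar_p - logvar_q
        + (torch.exp(logvar_q) + (mu_q - mu_p) ** 2) / torch.exp(logvar_p)
        - 1.0, dim=-1)

def elbo_loss(model, bow, prior_mu, prior_logvar):
    recon, mu, logvar = model(bow)
    nll = -(bow * recon).sum(dim=-1)  # multinomial reconstruction term
    kl = kl_diag_gaussians(mu, logvar, prior_mu, prior_logvar)
    return (nll + kl).mean()

# Hypothetical usage: an exemplar document gets its prior mean boosted at the
# user-identified topic index; all other documents keep a zero-mean prior.
batch, num_topics, vocab_size = 2, 20, 5000
model = LogitNormalNTM(vocab_size, num_topics)
bow = torch.rand(batch, vocab_size)
prior_mu = torch.zeros(batch, num_topics)
prior_mu[0, 3] = 3.0  # document 0 labeled with topic 3 (assumed boost value)
prior_logvar = torch.zeros(batch, num_topics)
loss = elbo_loss(model, bow, prior_mu, prior_logvar)

Because the user labels only a subset of each exemplar document's topics, only the labeled coordinates of the prior mean are shifted; the remaining coordinates stay at the uninformative default, which matches the non-exhaustive labeling assumption above.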

Cite

Text

Gemp et al. "Weakly Semi-Supervised Neural Topic Models." ICLR 2019 Workshops: LLD, 2019.

Markdown

[Gemp et al. "Weakly Semi-Supervised Neural Topic Models." ICLR 2019 Workshops: LLD, 2019.](https://mlanthology.org/iclrw/2019/gemp2019iclrw-weakly/)

BibTeX

@inproceedings{gemp2019iclrw-weakly,
  title     = {{Weakly Semi-Supervised Neural Topic Models}},
  author    = {Gemp, Ian and Nallapati, Ramesh and Ding, Ran and Nan, Feng and Xiang, Bing},
  booktitle = {ICLR 2019 Workshops: LLD},
  year      = {2019},
  url       = {https://mlanthology.org/iclrw/2019/gemp2019iclrw-weakly/}
}