Learning Topic Models by Neighborhood Aggregation

Abstract

Topic models are frequently used in machine learning owing to their high interpretability and modular structure. However, extending a topic model to include a supervisory signal, incorporate pre-trained word embedding vectors, or use a nonlinear output function is not an easy task, because one has to resort to a highly intricate approximate inference procedure. The present paper shows that topic modeling with pre-trained word embedding vectors can be viewed as implementing a neighborhood aggregation algorithm in which messages are passed through a network defined over words. In this network view, nodes correspond to the words in a document, and edges encode either the co-occurrence of words within a document or occurrences of the same word across the corpus. The network view allows us to extend the model to include supervisory signals, incorporate pre-trained word embedding vectors, and use a nonlinear output function in a simple manner. In experiments, we show that our approach outperforms a state-of-the-art supervised Latent Dirichlet Allocation implementation on held-out document classification tasks.
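The network construction described in the abstract can be illustrated with a small sketch. This is not the paper's actual algorithm; the corpus, embedding dimensionality, and mean-aggregation rule below are illustrative assumptions. Word tokens become nodes, edges connect tokens that share a document or that are the same word type, and one round of neighborhood aggregation averages the pre-trained embeddings of each node's neighbors.

```python
# Illustrative sketch (not the paper's exact algorithm): word tokens as nodes,
# edges for same-document co-occurrence and same-word-type links, and one
# round of mean neighborhood aggregation over toy "pre-trained" embeddings.
import numpy as np

docs = [["topic", "model", "graph"],
        ["graph", "neural", "model"]]

# Enumerate token nodes as (doc_id, position) pairs.
nodes = [(d, i) for d, doc in enumerate(docs) for i in range(len(doc))]
index = {node: k for k, node in enumerate(nodes)}

# Adjacency: intra-document edges (co-occurring tokens) plus
# cross-document edges between identical word types.
n = len(nodes)
adj = np.zeros((n, n))
for a, (da, ia) in enumerate(nodes):
    for b, (db, ib) in enumerate(nodes):
        if a == b:
            continue
        if da == db or docs[da][ia] == docs[db][ib]:
            adj[a, b] = 1.0

# Toy stand-in for pre-trained word embeddings (one vector per word type).
rng = np.random.default_rng(0)
vocab = sorted({w for doc in docs for w in doc})
emb = {w: rng.normal(size=4) for w in vocab}
x = np.stack([emb[docs[d][i]] for d, i in nodes])

# One neighborhood-aggregation step: each node takes the mean of its
# neighbors' embeddings (a message-passing update over the word network).
deg = adj.sum(axis=1, keepdims=True)
h = adj @ x / deg
print(h.shape)  # one aggregated vector per token node
```

In the paper's framing, a supervisory signal or nonlinear output function can then be attached on top of such aggregated representations, which is what makes the network view easy to extend.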

Cite

Text

Hisano. "Learning Topic Models by Neighborhood Aggregation." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/347

Markdown

[Hisano. "Learning Topic Models by Neighborhood Aggregation." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/hisano2019ijcai-learning/) doi:10.24963/IJCAI.2019/347

BibTeX

@inproceedings{hisano2019ijcai-learning,
  title     = {{Learning Topic Models by Neighborhood Aggregation}},
  author    = {Hisano, Ryohei},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2019},
  pages     = {2498--2505},
  doi       = {10.24963/IJCAI.2019/347},
  url       = {https://mlanthology.org/ijcai/2019/hisano2019ijcai-learning/}
}