Improving Distributed Word Representation and Topic Model by Word-Topic Mixture Model
Abstract
We propose a Word-Topic Mixture (WTM) model to improve word representations and the topic model simultaneously. First, it introduces initial external word embeddings into the Topical Word Embeddings (TWE) model, which is based on the Latent Dirichlet Allocation (LDA) model, to learn word embeddings and topic vectors. Then the results learned from TWE are integrated into LDA by defining a probability distribution over topic vectors and word embeddings, following the idea of the latent feature model with LDA (LFLDA), while minimizing the KL divergence between the new topic-word distribution and the original one. The experimental results show that the WTM model performs better on word representation and topic detection than several state-of-the-art models.
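For readers unfamiliar with the latent-feature construction the abstract refers to, a minimal sketch of one plausible form of the combined topic-word distribution and the KL term is given below. The notation here (mixture weight λ, topic vector τ_t, word embedding ω_w, original LDA topic-word multinomial φ_t) is our own shorthand for illustration and may differ from the paper's; the two-component mixture follows the LFLDA idea the abstract cites.

$$p(w \mid z = t) \;=\; (1 - \lambda)\,\mathrm{Mult}(w \mid \phi_t) \;+\; \lambda\,\mathrm{CatE}(w \mid \tau_t), \qquad \mathrm{CatE}(w \mid \tau_t) \;=\; \frac{\exp(\omega_w^{\top} \tau_t)}{\sum_{w'} \exp(\omega_{w'}^{\top} \tau_t)}$$

$$\min \; \sum_{t} \mathrm{KL}\!\left(\, p(\cdot \mid z = t) \;\middle\|\; \phi_t \,\right)$$

Under this reading, the embedding-based component pulls the topic-word distribution toward words whose vectors align with the topic vector, while the KL term keeps the new distribution close to the original LDA one.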
Cite
Text
Fu et al. "Improving Distributed Word Representation and Topic Model by Word-Topic Mixture Model." Proceedings of The 8th Asian Conference on Machine Learning, 2016.
Markdown
[Fu et al. "Improving Distributed Word Representation and Topic Model by Word-Topic Mixture Model." Proceedings of The 8th Asian Conference on Machine Learning, 2016.](https://mlanthology.org/acml/2016/fu2016acml-improving/)
BibTeX
@inproceedings{fu2016acml-improving,
  title     = {{Improving Distributed Word Representation and Topic Model by Word-Topic Mixture Model}},
  author    = {Fu, Xianghua and Wang, Ting and Li, Jing and Yu, Chong and Liu, Wangwang},
  booktitle = {Proceedings of The 8th Asian Conference on Machine Learning},
  year      = {2016},
  pages     = {190--205},
  volume    = {63},
  url       = {https://mlanthology.org/acml/2016/fu2016acml-improving/}
}