Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications
Abstract
Most unsupervised NLP models represent each word with a single point or single region in semantic space, while the existing multi-sense word embeddings cannot represent longer word sequences like phrases or sentences. We propose a novel embedding method for a text sequence (a phrase or a sentence) where each sequence is represented by a distinct set of multi-mode codebook embeddings to capture different semantic facets of its meaning. The codebook embeddings can be viewed as the cluster centers that summarize the distribution of possibly co-occurring words in a pre-trained word embedding space. We introduce an end-to-end trainable neural model that directly predicts the set of cluster centers from the input text sequence at test time. Our experiments show that the per-sentence codebook embeddings significantly improve performance on unsupervised sentence similarity and extractive summarization benchmarks. In phrase similarity experiments, we discover that the multi-facet embeddings provide an interpretable semantic representation but do not outperform the single-facet baseline.
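To make the "cluster centers" view concrete, the sketch below clusters the vectors of words that might co-occur with a sequence and treats the resulting centers as its codebook embeddings. It is only an illustration under loud assumptions: the 2-D vectors and the word list are hypothetical toy data, and plain k-means stands in for the clustering target. The paper's model instead predicts the centers directly with a neural network, which this sketch does not attempt to reproduce.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Seed the next center with the point farthest from its nearest center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        centers = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

# Hypothetical 2-D stand-ins for pre-trained word vectors (assumption, not real data).
toy_embeddings = {
    "loan": (1.0, 0.0), "money": (0.8, 0.2), "deposit": (0.9, 0.1),
    "river": (0.0, 1.0), "shore": (0.1, 0.9), "water": (0.2, 0.8),
}

# Hypothetical words that might co-occur with an ambiguous phrase such as "the bank".
cooccurring = ["loan", "money", "deposit", "river", "shore", "water"]

# Each center summarizes one semantic facet of the co-occurring words.
codebook = kmeans([toy_embeddings[w] for w in cooccurring], k=2)
for center in codebook:
    print(tuple(round(x, 3) for x in center))
```

With this toy data, one center settles among the finance-related words and the other among the river-related words, which is the sense in which a set of centers can capture different facets of one sequence.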
Cite
Text
Chang et al. "Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I8.16857
Markdown
[Chang et al. "Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/chang2021aaai-extending/) doi:10.1609/AAAI.V35I8.16857
BibTeX
@inproceedings{chang2021aaai-extending,
title = {{Extending Multi-Sense Word Embedding to Phrases and Sentences for Unsupervised Semantic Applications}},
author = {Chang, Haw-Shiuan and Agrawal, Amol and McCallum, Andrew},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {6956-6965},
doi = {10.1609/AAAI.V35I8.16857},
url = {https://mlanthology.org/aaai/2021/chang2021aaai-extending/}
}