Refining Word Representations by Manifold Learning
Abstract
Pre-trained distributed word representations have proven useful in various natural language processing (NLP) tasks. However, the effect of the geometric structure of words on word representations has not yet been carefully studied. Existing word representation methods underestimate the similarity of words that are close in Euclidean space, while overestimating that of words separated by much greater distances. In this paper, we propose a word vector refinement model that corrects pre-trained word embeddings, using manifold learning to bring the Euclidean distances between words closer to their semantic relations. The approach is theoretically grounded in the metric recovery paradigm. Our word representations are evaluated on a variety of lexical-level intrinsic tasks (semantic relatedness, semantic similarity), and the experimental results show that the proposed model outperforms several popular word representation approaches.
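The refinement idea in the abstract can be illustrated, very loosely, with a classical Isomap-style pipeline: build a k-nearest-neighbor graph over the pre-trained vectors, take graph shortest paths as geodesic (manifold) distances, and re-embed the words so that Euclidean distances approximate those geodesics, in the spirit of metric recovery. The sketch below is only an illustration under these assumptions; `refine_embeddings`, its parameters, and the Isomap-style recipe are hypothetical stand-ins and do not reproduce the paper's actual model.

```python
# Hedged sketch: Isomap-style refinement of pre-trained word vectors.
# This only illustrates the general manifold-learning idea described in
# the abstract: re-embed words so that Euclidean distances track geodesic
# distances on a neighborhood graph built from the original embeddings.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import kneighbors_graph
from sklearn.manifold import MDS

def refine_embeddings(vectors, n_neighbors=10, out_dim=None, random_state=0):
    """Refine word vectors via kNN graph -> geodesic distances -> MDS."""
    out_dim = out_dim or vectors.shape[1]
    # 1. Neighborhood graph over the original embedding space,
    #    with edges weighted by Euclidean distance.
    graph = kneighbors_graph(vectors, n_neighbors=n_neighbors,
                             mode="distance", include_self=False)
    # 2. Geodesic distances = shortest paths along the graph (Dijkstra).
    geo = shortest_path(graph, method="D", directed=False)
    # Guard against disconnected components (infinite distances).
    finite_max = geo[np.isfinite(geo)].max()
    geo[~np.isfinite(geo)] = finite_max
    # 3. Metric recovery: find new vectors whose Euclidean distances
    #    approximate the geodesic ones (metric MDS on the precomputed
    #    dissimilarity matrix).
    mds = MDS(n_components=out_dim, dissimilarity="precomputed",
              random_state=random_state)
    return mds.fit_transform(geo)

# Toy usage with random vectors as stand-ins for pre-trained embeddings
# such as GloVe or word2vec.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pretrained = rng.normal(size=(200, 50))
    refined = refine_embeddings(pretrained, n_neighbors=8, out_dim=50)
    print(refined.shape)  # (200, 50)
```

Dijkstra shortest paths followed by metric MDS is the textbook Isomap recipe; the paper's refinement model may weight neighborhoods differently or optimize a different objective, so treat this purely as a conceptual baseline.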
Cite
Text
Chu et al. "Refining Word Representations by Manifold Learning." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/749
Markdown
[Chu et al. "Refining Word Representations by Manifold Learning." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/chu2019ijcai-refining/) doi:10.24963/IJCAI.2019/749
BibTeX
@inproceedings{chu2019ijcai-refining,
title = {{Refining Word Representations by Manifold Learning}},
author = {Chu, Yonghe and Lin, Hongfei and Yang, Liang and Diao, Yufeng and Zhang, Shaowu and Fan, Xiaochao},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {5394--5400},
doi = {10.24963/IJCAI.2019/749},
url = {https://mlanthology.org/ijcai/2019/chu2019ijcai-refining/}
}