Imposing Category Trees onto Word-Embeddings Using a Geometric Construction

Abstract

We present a novel method to precisely impose tree-structured category information onto word embeddings, resulting in ball embeddings in higher-dimensional spaces (N-balls for short). Inclusion relations among N-balls implicitly encode subordinate relations among categories. The cosine similarity measure is enriched by category information. Using a geometric construction method instead of back-propagation, we create large N-ball embeddings that satisfy two conditions: (1) category trees are precisely imposed onto word embeddings at zero energy cost; (2) pre-trained word embeddings are well preserved. A new benchmark data set is created for validating the category of unknown words. Experiments show that N-ball embeddings, carrying category information, significantly outperform word embeddings in the test of nearest neighborhoods, and demonstrate surprisingly good performance in validating categories of unknown words. Source code and data sets are publicly available at https://github.com/gnodisnait/nball4tree.git and https://github.com/gnodisnait/bp94nball.git.
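
To make the idea of inclusion relations among N-balls concrete, here is a minimal sketch, not the authors' implementation. It assumes each ball is represented by a center vector and a radius, and it uses the standard Euclidean conditions for one ball lying inside another and for two balls being disjoint. The function names (`distance`, `contains`, `disjoint`) and the toy 2-D coordinates are hypothetical and chosen only for illustration.

```python
import math

# A ball is modeled as (center, radius); this representation is an assumption
# made for illustration, not taken from the released code.

def distance(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contains(outer, inner):
    """True if `inner` lies entirely inside `outer`."""
    (c_out, r_out), (c_in, r_in) = outer, inner
    return distance(c_out, c_in) + r_in <= r_out

def disjoint(a, b):
    """True if the two balls do not overlap."""
    return distance(a[0], b[0]) >= a[1] + b[1]

# Toy category tree: "animal" is the parent of the siblings "dog" and "cat".
animal = ((0.0, 0.0), 5.0)
dog = ((1.0, 1.0), 1.0)
cat = ((-2.0, 0.0), 1.0)

is_dog_an_animal = contains(animal, dog)
is_cat_an_animal = contains(animal, cat)
are_siblings_separate = disjoint(dog, cat)
```

In this toy layout, a child category sits inside its parent ball while sibling categories occupy separate balls, which is how a tree can be read off from purely geometric relations.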

Cite

Text

Dong et al. "Imposing Category Trees onto Word-Embeddings Using a Geometric Construction." International Conference on Learning Representations, 2019.

Markdown

[Dong et al. "Imposing Category Trees onto Word-Embeddings Using a Geometric Construction." International Conference on Learning Representations, 2019.](https://mlanthology.org/iclr/2019/dong2019iclr-imposing/)

BibTeX

@inproceedings{dong2019iclr-imposing,
  title     = {{Imposing Category Trees onto Word-Embeddings Using a Geometric Construction}},
  author    = {Dong, Tiansi and Bauckhage, Christian and Jin, Hailong and Li, Juanzi and Cremers, Olaf and Speicher, Daniel and Cremers, Armin B. and Zimmermann, Joerg},
  booktitle = {International Conference on Learning Representations},
  year      = {2019},
  url       = {https://mlanthology.org/iclr/2019/dong2019iclr-imposing/}
}