All-but-the-Top: Simple and Effective Postprocessing for Word Representations

Abstract

Real-valued word representations have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities. In this paper, we demonstrate a *very simple*, and yet counter-intuitive, postprocessing technique -- eliminate the common mean vector and a few top dominating directions from the word vectors -- that renders off-the-shelf representations *even stronger*. The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification) on multiple datasets and with a variety of representation methods and hyperparameter choices in multiple languages; in each case, the processed representations are consistently better than the original ones.
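The procedure the abstract describes is simple enough to sketch directly. Below is a minimal NumPy implementation, assuming word vectors are stacked as rows of a matrix; the function name `all_but_the_top` and the default `D=3` are illustrative choices (the paper's rule of thumb is roughly `D ≈ d/100`, e.g. `D=3` for 300-dimensional vectors), not an official reference implementation.

```python
import numpy as np

def all_but_the_top(X, D=3):
    """Postprocess word vectors as described in the abstract:
    (1) subtract the common mean vector, then (2) remove each
    vector's projection onto the top D principal directions.

    X: (vocab_size, dim) matrix, one word vector per row.
    D: number of dominating directions to eliminate.
    """
    # 1. Remove the common mean vector.
    X_centered = X - X.mean(axis=0)

    # 2. Find the top D principal directions of the centered
    #    vectors via SVD (rows of Vt are the directions).
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    U = Vt[:D]  # shape (D, dim)

    # 3. Subtract the projections onto those directions.
    return X_centered - X_centered @ U.T @ U

# Usage sketch with random vectors standing in for pretrained
# embeddings (e.g. word2vec or GloVe loaded as a matrix).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10_000, 300))
    X_post = all_but_the_top(X, D=3)
```

The postprocessed matrix `X_post` is a drop-in replacement for the original vectors in downstream similarity, analogy, or classification pipelines.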

Cite

Text

Mu and Viswanath. "All-but-the-Top: Simple and Effective Postprocessing for Word Representations." International Conference on Learning Representations, 2018.

Markdown

[Mu and Viswanath. "All-but-the-Top: Simple and Effective Postprocessing for Word Representations." International Conference on Learning Representations, 2018.](https://mlanthology.org/iclr/2018/mu2018iclr-allbutthetop/)

BibTeX

@inproceedings{mu2018iclr-allbutthetop,
  title     = {{All-but-the-Top: Simple and Effective Postprocessing for Word Representations}},
  author    = {Mu, Jiaqi and Viswanath, Pramod},
  booktitle = {International Conference on Learning Representations},
  year      = {2018},
  url       = {https://mlanthology.org/iclr/2018/mu2018iclr-allbutthetop/}
}