Enriching Word Embeddings with a Regressor Instead of Labeled Corpora
Abstract
We propose a novel method for enriching word embeddings without the need for a labeled corpus. Instead, we show that relying on a regressor, trained with a small lexicon to predict pseudo-labels, significantly improves performance over current techniques that rely on human-derived sentence-level labels for an entire corpus. Our approach enables enrichment for corpora that have no labels (such as Wikipedia). Exploring the utility of this general approach on both sentiment and non-sentiment-focused tasks, we show that enriching both Twitter- and Wikipedia-based embeddings yields notable performance improvements on binary sentiment classification, SemEval tasks, an embedding analogy task, and document classification. Importantly, our approach is notably better and more generalizable than other state-of-the-art approaches for enriching both labeled and unlabeled corpora.
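The core idea the abstract describes, training a regressor on a small lexicon, predicting pseudo-labels for the whole vocabulary, and using those predictions to enrich the embeddings, can be sketched as follows. This is only an illustrative toy, not the paper's actual method: the random embeddings, the tiny seed lexicon, the closed-form ridge regressor, and the "append the pseudo-label as an extra dimension" enrichment step are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pre-trained embeddings: a 6-word vocabulary, 4 dimensions per word
# (assumed data; in practice these would come from word2vec/GloVe, etc.).
vocab = ["good", "great", "bad", "awful", "table", "run"]
embeddings = {w: rng.normal(size=4) for w in vocab}

# Small seed lexicon mapping a few words to sentiment scores in [-1, 1]
# (assumed data standing in for a real sentiment lexicon).
lexicon = {"good": 0.8, "great": 0.9, "bad": -0.7, "awful": -0.9}

# Fit a ridge regressor (closed form) from embedding vector -> score.
X = np.stack([embeddings[w] for w in lexicon])
y = np.array(list(lexicon.values()))
lam = 0.1  # regularization strength (assumed hyperparameter)
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Predict a pseudo-label for every word, including out-of-lexicon words,
# so no labeled corpus is ever needed.
pseudo_labels = {w: float(embeddings[w] @ W) for w in vocab}

# One simple enrichment: append each word's pseudo-label as an extra
# embedding dimension (a stand-in for the paper's enrichment step).
enriched = {w: np.append(v, pseudo_labels[w]) for w, v in embeddings.items()}

print({w: round(s, 3) for w, s in pseudo_labels.items()})
print(enriched["good"].shape)  # original 4 dims plus 1 pseudo-label dim
```

The point of the sketch is the data flow: the regressor generalizes a few hundred lexicon entries to the full vocabulary, so any unlabeled corpus's embeddings can be enriched.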
Cite
Text
Abdalla et al. "Enriching Word Embeddings with a Regressor Instead of Labeled Corpora." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33016188

Markdown
[Abdalla et al. "Enriching Word Embeddings with a Regressor Instead of Labeled Corpora." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/abdalla2019aaai-enriching/) doi:10.1609/AAAI.V33I01.33016188

BibTeX
@inproceedings{abdalla2019aaai-enriching,
title = {{Enriching Word Embeddings with a Regressor Instead of Labeled Corpora}},
author = {Abdalla, Mohamed and Sahlgren, Magnus and Hirst, Graeme},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2019},
pages = {6188-6195},
doi = {10.1609/AAAI.V33I01.33016188},
url = {https://mlanthology.org/aaai/2019/abdalla2019aaai-enriching/}
}