MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation

Abstract

The knowledge acquisition bottleneck strongly affects the creation of multilingual sense-annotated data, limiting the power of supervised systems when applied to multilingual Word Sense Disambiguation. In this paper, we propose a semi-supervised approach based on a novel label propagation scheme which, by jointly leveraging contextualized word embeddings and the multilingual information contained in a knowledge base, projects sense labels from a high-resource language, i.e., English, to lower-resourced ones. Through several experiments, we provide empirical evidence that our automatically created datasets are of higher quality than those generated by competing approaches and lead a supervised model to achieve state-of-the-art performance in all multilingual Word Sense Disambiguation tasks. We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan.
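
As a rough illustration of the general idea only (not the authors' actual algorithm, which the abstract does not detail), the sketch below propagates a sense label from a labeled English instance to an unlabeled target-language instance when two conditions hold: the knowledge base lists that sense as a candidate for the target lemma, and the two context vectors are sufficiently similar. The function names, the toy two-dimensional vectors, the sense identifiers, and the similarity threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def propagate(labeled, unlabeled, candidates, threshold=0.5):
    """Assign each unlabeled instance the sense of its most similar
    labeled instance, restricted to senses the knowledge base allows.

    labeled:    list of (vector, sense_id) pairs from the source language
    unlabeled:  list of (lemma, vector) pairs in the target language
    candidates: dict mapping a target lemma to its set of candidate sense ids
                (a stand-in for a multilingual knowledge base lookup)
    threshold:  hypothetical minimum similarity required to propagate a label
    """
    results = []
    for lemma, vec in unlabeled:
        allowed = candidates.get(lemma, set())
        best_sense, best_sim = None, threshold
        for src_vec, sense in labeled:
            if sense not in allowed:
                continue  # the knowledge base rules this sense out
            sim = cosine(vec, src_vec)
            if sim > best_sim:
                best_sense, best_sim = sense, sim
        results.append((lemma, best_sense))
    return results


# Toy data: vectors stand in for contextualized word embeddings.
labeled = [([1.0, 0.0], "bank_finance"), ([0.0, 1.0], "bank_river")]
unlabeled = [("banca", [0.9, 0.1]), ("riva", [0.1, 0.9]), ("banca", [0.1, 0.9])]
candidates = {"banca": {"bank_finance"}, "riva": {"bank_river"}}

result = propagate(labeled, unlabeled, candidates)
```

In this toy run the third instance stays unlabeled: its only candidate sense is `bank_finance`, but its context vector is too dissimilar from the labeled financial example to clear the threshold.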

Cite

Text

Barba et al. "MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/531

Markdown

[Barba et al. "MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/barba2020ijcai-mulan/) doi:10.24963/IJCAI.2020/531

BibTeX

@inproceedings{barba2020ijcai-mulan,
  title     = {{MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation}},
  author    = {Barba, Edoardo and Procopio, Luigi and Campolungo, Niccolò and Pasini, Tommaso and Navigli, Roberto},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {3837--3844},
  doi       = {10.24963/IJCAI.2020/531},
  url       = {https://mlanthology.org/ijcai/2020/barba2020ijcai-mulan/}
}