Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia

Abstract

The hyperlink structure of Wikipedia constitutes a key resource for many Natural Language Processing tasks and applications, as it provides several million semantic annotations of entities in context. Yet only a small fraction of mentions across the entire Wikipedia corpus is linked. In this paper we present the automatic construction and evaluation of a Semantically Enriched Wikipedia in which the overall number of linked mentions has been more than tripled solely by exploiting the structure of Wikipedia itself and the wide-coverage sense inventory of BabelNet. As a result we obtain a sense-annotated corpus with more than 200 million annotations of over 4 million different concepts and named entities. We then show that our corpus leads to competitive results on multiple tasks, such as Entity Linking and Word Similarity.

Cite

Text

Raganato et al. "Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia." International Joint Conference on Artificial Intelligence, 2016.

Markdown

[Raganato et al. "Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia." International Joint Conference on Artificial Intelligence, 2016.](https://mlanthology.org/ijcai/2016/raganato2016ijcai-automatic/)

BibTeX

@inproceedings{raganato2016ijcai-automatic,
  title     = {{Automatic Construction and Evaluation of a Large Semantically Enriched Wikipedia}},
  author    = {Raganato, Alessandro and Delli Bovi, Claudio and Navigli, Roberto},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2016},
  pages     = {2894--2900},
  url       = {https://mlanthology.org/ijcai/2016/raganato2016ijcai-automatic/}
}