Knowledge Derived from Wikipedia for Computing Semantic Relatedness

Abstract

Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question of whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine-learning-based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.

Cite

Text

Ponzetto and Strube. "Knowledge Derived from Wikipedia for Computing Semantic Relatedness." Journal of Artificial Intelligence Research, 2007. doi:10.1613/JAIR.2308

Markdown

[Ponzetto and Strube. "Knowledge Derived from Wikipedia for Computing Semantic Relatedness." Journal of Artificial Intelligence Research, 2007.](https://mlanthology.org/jair/2007/ponzetto2007jair-knowledge/) doi:10.1613/JAIR.2308

BibTeX

@article{ponzetto2007jair-knowledge,
  title     = {{Knowledge Derived from Wikipedia for Computing Semantic Relatedness}},
  author    = {Ponzetto, Simone Paolo and Strube, Michael},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2007},
  pages     = {181--212},
  doi       = {10.1613/JAIR.2308},
  volume    = {30},
  url       = {https://mlanthology.org/jair/2007/ponzetto2007jair-knowledge/}
}