Towards Knowledge-Driven Annotation

Abstract

While the Web of data is attracting increasing interest and growing rapidly in size, multimedia documents are still the main carriers of information on the surface Web. Semantic annotation of texts is one of the main processes intended to facilitate meaning-based information exchange between computational agents. However, such annotation faces several challenges, such as the heterogeneity of natural language expressions, the heterogeneity of document structures, and context dependencies. While a broad range of annotation approaches rely mainly or partly on the target textual context to disambiguate the extracted entities, in this paper we present an approach that relies mainly on formalized knowledge expressed in RDF datasets to categorize and disambiguate noun phrases. In the proposed method, we represent the reference knowledge bases as co-occurrence matrices and the disambiguation problem as a 0-1 Integer Linear Programming (ILP) problem. The proposed approach is unsupervised and can be ported to any RDF knowledge base. The system implementing this approach, called KODA, shows very promising results compared with state-of-the-art annotation tools in cross-domain experiments.
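
To make the idea of disambiguation as a 0-1 selection problem over co-occurrence data more concrete, here is a minimal sketch. It is not KODA's actual formulation: the toy candidates, the co-occurrence counts, the objective (sum of pairwise co-occurrences), and the function name `disambiguate` are all assumptions made for illustration. Choosing exactly one candidate per noun phrase corresponds to binary variables with one "sum to 1" constraint per phrase; the sketch solves this by exhaustive enumeration rather than with an ILP solver.

```python
from itertools import product

# Hypothetical toy data: candidate knowledge-base resources per noun phrase.
candidates = {
    "Paris": ["Paris_France", "Paris_Hilton"],
    "Seine": ["Seine_River"],
}

# Hypothetical symmetric co-occurrence counts between resources,
# standing in for a co-occurrence matrix derived from an RDF dataset.
cooc = {
    frozenset(["Paris_France", "Seine_River"]): 5,
    frozenset(["Paris_Hilton", "Seine_River"]): 0,
}

def disambiguate(candidates, cooc):
    """Pick one candidate per mention, maximizing total pairwise co-occurrence.

    Enumerating one choice per mention is equivalent to a 0-1 program with
    variables x[m, c] in {0, 1} and the constraint sum_c x[m, c] == 1.
    """
    mentions = list(candidates)
    best_score, best = -1, None
    for choice in product(*(candidates[m] for m in mentions)):
        # Assumed objective: sum co-occurrence over all unordered pairs.
        score = sum(
            cooc.get(frozenset([a, b]), 0)
            for i, a in enumerate(choice)
            for b in choice[i + 1:]
        )
        if score > best_score:
            best_score, best = score, dict(zip(mentions, choice))
    return best, best_score

assignment, score = disambiguate(candidates, cooc)
```

Exhaustive search is only practical for a handful of noun phrases; for realistic inputs the same objective and constraints would be handed to an ILP solver.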

Cite

Text

Mrabet et al. "Towards Knowledge-Driven Annotation." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9521

Markdown

[Mrabet et al. "Towards Knowledge-Driven Annotation." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/mrabet2015aaai-knowledge/) doi:10.1609/AAAI.V29I1.9521

BibTeX

@inproceedings{mrabet2015aaai-knowledge,
  title     = {{Towards Knowledge-Driven Annotation}},
  author    = {Mrabet, Yassine and Gardent, Claire and Foulonneau, Muriel and Simperl, Elena and Ras, Eric},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2015},
  pages     = {2425-2431},
  doi       = {10.1609/AAAI.V29I1.9521},
  url       = {https://mlanthology.org/aaai/2015/mrabet2015aaai-knowledge/}
}