Using Semantic Distance in a Content-Based Heterogeneous Information Retrieval System
Abstract
This paper makes two contributions to the semantic retrieval of heterogeneous information (documents composed of text and images): (1) a new context-based semantic distance measure for textual data, and (2) an IR system that indexes documents conceptually and automatically, taking their heterogeneous content into account through a domain-specific ontology. The proposed semantic distance measure is used to automatically fuzzify our domain ontology. Both proposals are evaluated, with promising results. On a set of word pairs, our semantic distance measure achieved a correlation ratio of 0.89 with human judgments, outperforming all the other measures compared. Preliminary combination results obtained on a specialized corpus of web pages are also reported.
Cite
Text
El Sayed et al. "Using Semantic Distance in a Content-Based Heterogeneous Information Retrieval System." European Conference on Machine Learning, 2007. doi:10.1007/978-3-540-68416-9_18
Markdown
[El Sayed et al. "Using Semantic Distance in a Content-Based Heterogeneous Information Retrieval System." European Conference on Machine Learning, 2007.](https://mlanthology.org/ecmlpkdd/2007/sayed2007ecml-using/) doi:10.1007/978-3-540-68416-9_18
BibTeX
@inproceedings{sayed2007ecml-using,
title = {{Using Semantic Distance in a Content-Based Heterogeneous Information Retrieval System}},
author = {El Sayed, Ahmad and Hacid, Hakim and Zighed, Djamel A.},
booktitle = {European Conference on Machine Learning},
year = {2007},
pages = {224-237},
doi = {10.1007/978-3-540-68416-9_18},
url = {https://mlanthology.org/ecmlpkdd/2007/sayed2007ecml-using/}
}