A Language-Based Similarity Measure
Abstract
This paper presents a unified framework for defining similarity measures across various formalisms (attribute-value, first-order logic, ...). The underlying idea is that the similarity between two objects depends not only on their attribute values but, more importantly, on the set of potentially relevant concept definitions for the problem considered. In our framework, the user defines a language, by means of a grammar, to specify the similarity measure. Each term of the language represents a property of the objects. The similarity between two objects is the probability that the two objects both satisfy or both reject the properties of the given language. When this probability is not computable, we use a stochastic generation procedure to approximate it. This measure can be applied to both clustering and classification tasks. The empirical evaluation on common classification problems shows very good accuracy.
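The stochastic approximation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy language (threshold tests of the form "attribute i <= t" on attributes scaled to [0, 1]), the uniform sampling of terms, and the names `sample_property` and `estimate_similarity` are all assumptions made here for illustration. The sketch draws random terms from the language and reports the fraction on which the two objects agree, that is, both satisfy or both reject the term.

```python
import random


def sample_property(n_attributes, rng):
    """Draw one term from a hypothetical toy language: 'attribute i <= t'.

    Assumes attribute values are scaled to [0, 1]; the paper's languages
    are defined by a user-supplied grammar and may be far richer.
    """
    i = rng.randrange(n_attributes)
    t = rng.random()
    return lambda obj: obj[i] <= t


def estimate_similarity(a, b, n_samples=10_000, seed=0):
    """Monte Carlo estimate of P(a and b both satisfy or both reject a term)."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_samples):
        prop = sample_property(len(a), rng)
        # The objects agree when the term is true for both or false for both.
        if prop(a) == prop(b):
            agree += 1
    return agree / n_samples


if __name__ == "__main__":
    x = (0.2, 0.5)
    y = (0.4, 0.5)
    print(estimate_similarity(x, x))  # identical objects always agree
    print(estimate_similarity(x, y))
```

Under these assumptions, identical objects receive a similarity of exactly 1.0, and the estimate is symmetric in its two arguments because the same seeded sequence of terms is applied to both objects. Changing the language (for example, restricting which attributes may appear in a term) changes the measure, which is the point the abstract makes about the similarity depending on the relevant concept definitions.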
Cite
Text
Martin and Moal. "A Language-Based Similarity Measure." European Conference on Machine Learning, 2001. doi:10.1007/3-540-44795-4_29
Markdown
[Martin and Moal. "A Language-Based Similarity Measure." European Conference on Machine Learning, 2001.](https://mlanthology.org/ecmlpkdd/2001/martin2001ecml-languagebased/) doi:10.1007/3-540-44795-4_29
BibTeX
@inproceedings{martin2001ecml-languagebased,
title = {{A Language-Based Similarity Measure}},
author = {Martin, Lionel and Moal, Frédéric},
booktitle = {European Conference on Machine Learning},
year = {2001},
pages = {336-347},
doi = {10.1007/3-540-44795-4_29},
url = {https://mlanthology.org/ecmlpkdd/2001/martin2001ecml-languagebased/}
}