Weighted Symbols-Based Edit Distance for String-Structured Image Classification

Abstract

As an alternative to vector representations, a recent trend in image classification is to integrate additional structural information into the description of images in order to improve classification accuracy. Rather than being represented in a p-dimensional space, images can typically be encoded as strings, trees or graphs, and are usually compared either by computing suitable metrics such as the string or tree edit distance, or by testing subgraph isomorphism. In this paper, we propose a new way of representing images as strings whose symbols are weighted according to a TF-IDF-based weighting scheme inspired by information retrieval. To handle such real-valued weights, we first introduce a new weighted string edit distance that keeps the properties of a distance. In particular, we prove that the triangle inequality is preserved, which allows the edit distance to be computed in quadratic time by dynamic programming. We show on an image classification task that our new weighted edit distance not only significantly outperforms the standard edit distance but also appears very competitive with standard approaches based on histogram distances.
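To make the idea of an edit distance over weighted symbols concrete, the following is a minimal Python sketch of the quadratic-time dynamic program. The cost functions are assumptions chosen for illustration and are not taken from the paper: inserting or deleting a symbol costs its weight, substituting a symbol by the same symbol costs the absolute weight difference, and substituting by a different symbol costs the larger of the two weights. The names `WeightedString`, `sub_cost` and `weighted_edit_distance` are hypothetical.

```python
from typing import List, Tuple

# A weighted string is a sequence of (symbol, weight) pairs, e.g. visual
# words carrying TF-IDF-style weights. (Hypothetical type, for illustration.)
WeightedString = List[Tuple[str, float]]


def sub_cost(a: Tuple[str, float], b: Tuple[str, float]) -> float:
    """Assumed substitution cost, not the paper's definition."""
    (sym_a, w_a), (sym_b, w_b) = a, b
    if sym_a == sym_b:
        # Same symbol: pay only for the change in weight.
        return abs(w_a - w_b)
    # Different symbols: pay the larger of the two weights.
    return max(w_a, w_b)


def weighted_edit_distance(x: WeightedString, y: WeightedString) -> float:
    """Dynamic program over an (len(x)+1) x (len(y)+1) table."""
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]

    # First column: delete every symbol of x (cost = its weight).
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + x[i - 1][1]
    # First row: insert every symbol of y (cost = its weight).
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + y[j - 1][1]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + x[i - 1][1],                    # deletion
                d[i][j - 1] + y[j - 1][1],                    # insertion
                d[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]),  # substitution
            )
    return d[n][m]


if __name__ == "__main__":
    x = [("a", 0.5), ("b", 0.25)]
    y = [("a", 0.5), ("c", 0.25)]
    print(weighted_edit_distance(x, y))
```

The table has one cell per pair of prefixes and each cell is filled in constant time, so the running time is proportional to the product of the two string lengths. With all weights set to 1, these assumed costs reduce to the standard unit-cost edit distance.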

Cite

Text

Barat et al. "Weighted Symbols-Based Edit Distance for String-Structured Image Classification." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. doi:10.1007/978-3-642-15880-3_11

Markdown

[Barat et al. "Weighted Symbols-Based Edit Distance for String-Structured Image Classification." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010.](https://mlanthology.org/ecmlpkdd/2010/barat2010ecmlpkdd-weighted/) doi:10.1007/978-3-642-15880-3_11

BibTeX

@inproceedings{barat2010ecmlpkdd-weighted,
  title     = {{Weighted Symbols-Based Edit Distance for String-Structured Image Classification}},
  author    = {Barat, Cécile and Ducottet, Christophe and Fromont, Élisa and Legrand, Anne-Claire and Sebban, Marc},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2010},
  pages     = {72--86},
  doi       = {10.1007/978-3-642-15880-3_11},
  url       = {https://mlanthology.org/ecmlpkdd/2010/barat2010ecmlpkdd-weighted/}
}