Learning Visual Shape Lexicon for Document Image Content Recognition

Zhu, Guangyu; Yu, Xiaodong; Li, Yi; Doermann, David S.

doi:10.1007/978-3-540-88688-4_55

ECCV 2008 pp. 745-758

Abstract

Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content categorization using a lexicon of shape features. Each lexical word corresponds to a scale- and rotation-invariant shape feature that is generic enough to be detected repeatably and without segmentation. We learn a concise, structurally indexed shape lexicon from training data by clustering and partitioning feature types through graph cuts. We demonstrate our approach on two challenging document image content recognition problems: 1) classification of 4,500 Web images crawled from Google Image Search into three content categories (pure image, image with text, and document image), and 2) language identification of 8 languages (Arabic, Chinese, English, Hindi, Japanese, Korean, Russian, and Thai) on a database of 1,512 complex document images composed of mixed machine-printed text and handwriting. Our approach can handle high intra-class variability and shows results that exceed other state-of-the-art approaches, allowing it to be used as a content recognizer in image indexing and retrieval systems.
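To make the lexicon idea concrete, the following is a minimal sketch of the general pattern the abstract describes: learn a small set of lexical words from training features, then summarize an image by how often each word occurs. It is not the authors' method. Plain k-means is swapped in for the paper's clustering and graph-cut partitioning, the descriptors are made-up 2-D points rather than real shape features, and the names `learn_lexicon`, `histogram`, and `nearest` are hypothetical.

```python
import math


def nearest(codebook, x):
    """Index of the codeword closest to descriptor x (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(codebook[k], x))


def learn_lexicon(descriptors, k, iters=20):
    """Plain k-means: an assumed stand-in for the paper's graph-cut step."""
    n = len(descriptors)
    # Deterministic, evenly spaced initialization (arbitrary choice for this sketch).
    codebook = [descriptors[i * n // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in descriptors:
            groups[nearest(codebook, x)].append(x)
        # Move each codeword to the mean of its group; keep it if the group is empty.
        codebook = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else codebook[i]
            for i, g in enumerate(groups)
        ]
    return codebook


def histogram(codebook, descriptors):
    """Normalized count of how often each lexical word occurs in one image."""
    counts = [0] * len(codebook)
    for x in descriptors:
        counts[nearest(codebook, x)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


# Toy 2-D descriptors forming two well-separated groups (made-up data).
train = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
lexicon = learn_lexicon(train, k=2)

# Descriptors from one hypothetical image, mapped onto the learned lexicon.
h = histogram(lexicon, [(0.05, 0.05), (4.9, 5.0), (5.2, 5.1)])
print(h)
```

The resulting histogram is a fixed-length vector that could be passed to any standard classifier; which classifier the paper uses is not stated in the abstract, so none is assumed here.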

Cite

Text

Zhu et al. "Learning Visual Shape Lexicon for Document Image Content Recognition." European Conference on Computer Vision, 2008. doi:10.1007/978-3-540-88688-4_55

Markdown

[Zhu et al. "Learning Visual Shape Lexicon for Document Image Content Recognition." European Conference on Computer Vision, 2008.](https://mlanthology.org/eccv/2008/zhu2008eccv-learning/) doi:10.1007/978-3-540-88688-4_55

BibTeX

@inproceedings{zhu2008eccv-learning,
  title     = {{Learning Visual Shape Lexicon for Document Image Content Recognition}},
  author    = {Zhu, Guangyu and Yu, Xiaodong and Li, Yi and Doermann, David S.},
  booktitle = {European Conference on Computer Vision},
  year      = {2008},
  pages     = {745--758},
  doi       = {10.1007/978-3-540-88688-4_55},
  url       = {https://mlanthology.org/eccv/2008/zhu2008eccv-learning/}
}