Adaptive Text Recognition Through Visual Matching

Zhang, Chuhan; Gupta, Ankush; Zisserman, Andrew

doi:10.1007/978-3-030-58517-4_4

ECCV 2020

Abstract

This work addresses the problems of generalization and flexibility for text recognition in documents. We introduce a new model that exploits the repetitive nature of characters in languages, and decouples the visual decoding and linguistic modelling stages through intermediate representations in the form of similarity maps. By doing this, we turn text recognition into a visual matching problem, thereby achieving generalization in appearance and flexibility in classes. We evaluate the model on both synthetic and real datasets across different languages and alphabets, and show that it can handle challenges that traditional architectures are unable to solve without expensive re-training, including: (i) it can change the number of classes simply by changing the exemplars; and (ii) it can generalize to novel languages and characters (not in the training data) simply by providing a new glyph exemplar set. In essence, it is able to carry out one-shot sequence recognition. We also demonstrate that the model can generalize to unseen fonts without requiring new exemplars from them.
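As a rough illustration of the matching idea described in the abstract, the sketch below builds a similarity map between glyph exemplar features and text-line features, then reads the text off that map. It is a minimal toy under stated assumptions, not the authors' model: the function names `similarity_map` and `greedy_decode` are hypothetical, the one-hot "features" stand in for a learned visual encoder, cosine similarity is an assumed choice of metric, and the greedy argmax decoding is a simple stand-in for the paper's decoding stage.

```python
import numpy as np


def similarity_map(glyph_feats, line_feats):
    """Cosine similarity between every exemplar position and every line position.

    glyph_feats: (G, D) array, one feature vector per glyph exemplar (assumed layout)
    line_feats:  (T, D) array, one feature vector per horizontal position in the line
    returns:     (G, T) similarity map
    """
    g = glyph_feats / np.linalg.norm(glyph_feats, axis=1, keepdims=True)
    t = line_feats / np.linalg.norm(line_feats, axis=1, keepdims=True)
    return g @ t.T


def greedy_decode(sim, alphabet):
    """Pick the best-matching exemplar at each position and merge repeated picks.

    Simplification: merging repeats also collapses genuine double letters,
    which a real decoder would have to handle.
    """
    best = sim.argmax(axis=0)
    out = []
    prev = None
    for idx in best:
        if idx != prev:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)


# Toy data: three exemplars with one-hot features (hypothetical stand-ins
# for encoder outputs), and a text line where each character spans two positions.
rng = np.random.default_rng(0)
exemplars = np.eye(3)
line = exemplars[[2, 2, 0, 0, 1, 1]] + 0.01 * rng.normal(size=(6, 3))

sim = similarity_map(exemplars, line)
print(greedy_decode(sim, "abc"))  # cab
# Relabelling the exemplar set changes the output classes without touching the matcher.
print(greedy_decode(sim, "xyz"))  # zxy
```

The point of the toy is that the class set lives entirely in the exemplars: swapping the labels (or the exemplar features) changes what can be recognized while the matching code stays fixed, which mirrors the abstract's claim about changing classes by changing the exemplars.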

Cite

Text

Zhang et al. "Adaptive Text Recognition Through Visual Matching." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58517-4_4

Markdown

[Zhang et al. "Adaptive Text Recognition Through Visual Matching." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/zhang2020eccv-adaptive/) doi:10.1007/978-3-030-58517-4_4

BibTeX

@inproceedings{zhang2020eccv-adaptive,
  title     = {{Adaptive Text Recognition Through Visual Matching}},
  author    = {Zhang, Chuhan and Gupta, Ankush and Zisserman, Andrew},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58517-4_4},
  url       = {https://mlanthology.org/eccv/2020/zhang2020eccv-adaptive/}
}