SCATTER: Selective Context Attentional Scene Text Recognizer

Litman, Ron; Anschel, Oron; Tsiper, Shahar; Litman, Roee; Mazor, Shai; Manmatha, R.

doi:10.1109/CVPR42600.2020.01198

CVPR 2020

Abstract

Scene Text Recognition (STR), the task of recognizing text against complex image backgrounds, is an active area of research. Current state-of-the-art (SOTA) methods still struggle to recognize text written in arbitrary shapes. In this paper, we introduce a novel architecture for STR, named Selective Context ATtentional Text Recognizer (SCATTER). SCATTER utilizes a stacked block architecture with intermediate supervision during training, which paves the way to successfully training a deep BiLSTM encoder, thus improving the encoding of contextual dependencies. Decoding is done using a two-step 1D attention mechanism. The first attention step re-weights visual features from a CNN backbone together with contextual features computed by a BiLSTM layer. The second attention step, similar to previous papers, treats the features as a sequence and attends to the intra-sequence relationships. Experiments show that the proposed approach surpasses SOTA performance on irregular text recognition benchmarks by 3.7% on average.
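The two-step decoding described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the re-weighting or the attention scores are computed, so the sigmoid gate, the linear scoring layer, the dot-product scoring, and all names (`selective_reweight`, `attend`, `weight`, `bias`, `query`) and shapes below are assumptions chosen for illustration.

```python
import numpy as np


def selective_reweight(visual, contextual, weight, bias):
    """First step (sketch): re-weight visual and contextual features together.

    visual:      (T, Cv) per-column features, standing in for CNN backbone output
    contextual:  (T, Ch) per-column features, standing in for BiLSTM output
    weight, bias: parameters of a hypothetical linear scoring layer,
                  shapes (Cv + Ch, Cv + Ch) and (Cv + Ch,)
    """
    features = np.concatenate([visual, contextual], axis=1)  # (T, Cv + Ch)
    scores = features @ weight + bias
    gate = 1.0 / (1.0 + np.exp(-scores))  # assumed sigmoid gate in (0, 1)
    return gate * features  # element-wise re-weighting


def attend(query, sequence):
    """Second step (sketch): 1D attention over the feature sequence.

    Dot-product scoring is an assumption; `query` stands in for a decoder state.
    """
    scores = sequence @ query          # (T,)
    scores = scores - scores.max()     # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum()           # attention weights sum to 1
    context = weights @ sequence       # (Cv + Ch,) weighted sum over positions
    return context, weights


# Toy inputs with made-up sizes.
rng = np.random.default_rng(0)
T, Cv, Ch = 26, 8, 8
visual = rng.standard_normal((T, Cv))
contextual = rng.standard_normal((T, Ch))
D = Cv + Ch
weight = rng.standard_normal((D, D)) * 0.1
bias = np.zeros(D)

reweighted = selective_reweight(visual, contextual, weight, bias)
query = rng.standard_normal(D)
context, weights = attend(query, reweighted)
```

In this sketch the first step decides, per position and per channel, how much of each visual or contextual feature to keep, and the second step pools the re-weighted sequence into a single context vector for one decoding step.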

Cite

Text

Litman et al. "SCATTER: Selective Context Attentional Scene Text Recognizer." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.01198

Markdown

[Litman et al. "SCATTER: Selective Context Attentional Scene Text Recognizer." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/litman2020cvpr-scatter/) doi:10.1109/CVPR42600.2020.01198

BibTeX

@inproceedings{litman2020cvpr-scatter,
  title     = {{SCATTER: Selective Context Attentional Scene Text Recognizer}},
  author    = {Litman, Ron and Anschel, Oron and Tsiper, Shahar and Litman, Roee and Mazor, Shai and Manmatha, R.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.01198},
  url       = {https://mlanthology.org/cvpr/2020/litman2020cvpr-scatter/}
}