FASText: Efficient Unconstrained Scene Text Detector

Abstract

We propose a novel, easy-to-implement stroke detector based on efficient pixel intensity comparisons to surrounding pixels. Stroke-specific keypoints are efficiently detected, and text fragments are subsequently extracted by local thresholding guided by keypoint properties. Classification based on efficiently calculated features then eliminates non-text regions. The stroke-specific keypoints produce 2 times fewer region segmentations, yet still detect 25% more characters, than the commonly exploited MSER detector, and the process is 4 times faster. After the novel, efficient classification step, the number of regions is 7 times lower than with the standard method, while the detector remains almost 3 times faster. All stages of the proposed pipeline are scale- and rotation-invariant and support a wide variety of scripts (Latin, Hebrew, Chinese, etc.) and fonts. When the proposed detector is plugged into a scene text localization and recognition pipeline, state-of-the-art text localization accuracy is maintained while the processing time is significantly reduced.
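
The stroke-specific keypoint test described in the abstract is in the spirit of the FAST corner test: the intensity of each pixel on a small circle around a candidate pixel is compared with the centre intensity. The sketch below illustrates such a test in Python/NumPy; the 12-pixel circle, the intensity margin, and the arc-counting rule are assumptions made for demonstration only and are not the authors' exact keypoint criteria.

```python
# Illustrative sketch only: a FAST-like pixel test loosely mimicking the
# stroke-specific keypoints described in the abstract. The circle radius,
# margin and arc rules are assumptions, not the FASText detector itself.
import numpy as np

# (dy, dx) offsets of a 12-pixel Bresenham circle of radius 2.
CIRCLE = [(-2, 0), (-2, 1), (-1, 2), (0, 2), (1, 2), (2, 1),
          (2, 0), (2, -1), (1, -2), (0, -2), (-1, -2), (-2, -1)]

def is_stroke_keypoint(img, y, x, margin=20):
    """Label each circle pixel as darker (-1), similar (0) or brighter (+1)
    than the centre pixel, then accept the centre as a candidate keypoint
    if the 'similar' pixels form exactly one or two contiguous arcs,
    loosely resembling stroke-ending / stroke-bend patterns."""
    c = int(img[y, x])
    labels = []
    for dy, dx in CIRCLE:
        p = int(img[y + dy, x + dx])
        if p < c - margin:
            labels.append(-1)
        elif p > c + margin:
            labels.append(1)
        else:
            labels.append(0)
    if all(l == 0 for l in labels):     # flat region, no stroke evidence
        return False
    # Count contiguous arcs of 'similar' pixels on the circular ring.
    arcs = sum(1 for i in range(len(labels))
               if labels[i] == 0 and labels[i - 1] != 0)
    return arcs in (1, 2)
```

In a full pipeline this cheap test would be swept over an image pyramid (the stages are scale-invariant), and the surviving keypoints would guide the local thresholding and the subsequent feature-based classification mentioned above.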

Cite

Text

Busta et al. "FASText: Efficient Unconstrained Scene Text Detector." International Conference on Computer Vision, 2015. doi:10.1109/ICCV.2015.143

Markdown

[Busta et al. "FASText: Efficient Unconstrained Scene Text Detector." International Conference on Computer Vision, 2015.](https://mlanthology.org/iccv/2015/busta2015iccv-fastext/) doi:10.1109/ICCV.2015.143

BibTeX

@inproceedings{busta2015iccv-fastext,
  title     = {{FASText: Efficient Unconstrained Scene Text Detector}},
  author    = {Busta, Michal and Neumann, Lukas and Matas, Jiri},
  booktitle = {International Conference on Computer Vision},
  year      = {2015},
  doi       = {10.1109/ICCV.2015.143},
  url       = {https://mlanthology.org/iccv/2015/busta2015iccv-fastext/}
}