A Fast and Robust Text Spotter

Abstract

We introduce an algorithm for text detection and localization ("spotting") that is computationally efficient and produces state-of-the-art results. Our system uses multi-channel MSERs to detect a large number of promising regions, then subsamples these regions using a clustering approach. Representatives of region clusters are binarized and then passed on to a deep network. A final line grouping stage forms word-level segments. On the ICDAR 2011 and 2015 benchmarks, our algorithm obtains an F-score of 82% and 83%, respectively, at a computational cost of 1.2 seconds per frame. We also introduce a version that is three times as fast, with only a slight reduction in performance.
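The abstract does not say which clustering method is used to subsample the candidate regions, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes candidate regions are represented by axis-aligned bounding boxes and groups them greedily by intersection-over-union, keeping one representative per cluster. The names `Box`, `iou`, `subsample_regions`, and the `threshold` value of 0.7 are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical region format: (x0, y0, x1, y1) with x1 > x0 and y1 > y0.
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def subsample_regions(boxes: List[Box], threshold: float = 0.7) -> List[Box]:
    """Greedy clustering (an assumption, not the paper's method).

    A box that overlaps an existing representative by at least `threshold`
    is treated as a member of that cluster and dropped; otherwise it
    becomes the representative of a new cluster.
    """
    representatives: List[Box] = []
    for box in boxes:
        if not any(iou(box, rep) >= threshold for rep in representatives):
            representatives.append(box)
    return representatives


# Three near-duplicate detections of one region plus one distinct region.
candidates = [(10, 10, 50, 40), (11, 10, 51, 41), (12, 11, 50, 40), (100, 20, 140, 50)]
kept = subsample_regions(candidates)
```

In a pipeline like the one described, only the representatives in `kept` would be passed to the more expensive classification stage, which is where the computational saving would come from.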

Cite

Text

Qin and Manduchi. "A Fast and Robust Text Spotter." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016. doi:10.1109/WACV.2016.7477663

Markdown

[Qin and Manduchi. "A Fast and Robust Text Spotter." IEEE/CVF Winter Conference on Applications of Computer Vision, 2016.](https://mlanthology.org/wacv/2016/qin2016wacv-fast/) doi:10.1109/WACV.2016.7477663

BibTeX

@inproceedings{qin2016wacv-fast,
  title     = {{A Fast and Robust Text Spotter}},
  author    = {Qin, Siyang and Manduchi, Roberto},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2016},
  pages     = {1--8},
  doi       = {10.1109/WACV.2016.7477663},
  url       = {https://mlanthology.org/wacv/2016/qin2016wacv-fast/}
}