Sequential Deformation for Accurate Scene Text Detection

Abstract

Scene text detection has advanced significantly in recent years, especially since the emergence of deep neural networks. However, owing to the high diversity of scene text in scale, orientation, shape and aspect ratio, as well as the inherent limitations of convolutional neural networks in handling geometric transformations, accurate scene text detection remains an open problem. In this paper, we propose a novel sequential deformation method to effectively model the line shape of scene text. An auxiliary character counting supervision is further introduced to guide the sequential offset prediction. The whole network can be easily optimized in an end-to-end, multi-task manner. Extensive experiments are conducted on public scene text detection datasets including ICDAR 2017 MLT, ICDAR 2015, Total-text and SCUT-CTW1500. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art methods.

Cite

Text

Xiao et al. "Sequential Deformation for Accurate Scene Text Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58526-6_7

Markdown

[Xiao et al. "Sequential Deformation for Accurate Scene Text Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/xiao2020eccv-sequential/) doi:10.1007/978-3-030-58526-6_7

BibTeX

@inproceedings{xiao2020eccv-sequential,
  title     = {{Sequential Deformation for Accurate Scene Text Detection}},
  author    = {Xiao, Shanyu and Peng, Liangrui and Yan, Ruijie and An, Keyu and Yao, Gang and Min, Jaesik},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58526-6_7},
  url       = {https://mlanthology.org/eccv/2020/xiao2020eccv-sequential/}
}