Geometry-Aware Scene Text Detection with Instance Transformation Network

Abstract

Localizing text in the wild is challenging when the targets have complicated geometric layouts, such as arbitrary orientation and large aspect ratio. In this paper, we propose a geometry-aware modeling approach tailored for scene text representation with an end-to-end learning scheme. We present a novel Instance Transformation Network (ITN) that learns a geometry-aware representation, encoding the unique geometric configurations of scene text instances through in-network transformation embedding. The result is a robust and elegant framework that detects words or text lines in one pass. The ITN is trained with an end-to-end multi-task learning strategy that combines transformation regression, text/non-text classification, and coordinate regression. Experiments on benchmark datasets demonstrate the effectiveness of the proposed approach in detecting scene text in various geometric configurations.
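As a rough illustration of the multi-task objective named in the abstract, the sketch below combines the three terms (text/non-text classification, transformation regression, coordinate regression) into one scalar loss for a single candidate instance. The abstract does not specify the loss functions or their weighting, so everything here is an assumption: binary cross-entropy for classification, smooth-L1 for both regressions, the hypothetical weights `w_transform` and `w_coords`, and the choice to apply the regression terms only to text instances. It is not the authors' implementation.

```python
import math


def smooth_l1(pred, target):
    """Mean smooth-L1 distance (threshold 1.0) between equal-length sequences."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / len(pred)


def binary_cross_entropy(logit, label):
    """Numerically stable binary cross-entropy on a raw logit; label is 0 or 1."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def multi_task_loss(cls_logit, is_text,
                    pred_transform, gt_transform,
                    pred_coords, gt_coords,
                    w_transform=1.0, w_coords=1.0):
    """Assumed weighted sum of the three task losses for one candidate.

    The weights and the masking of regression terms for non-text
    candidates are illustrative choices, not taken from the paper.
    """
    loss = binary_cross_entropy(cls_logit, is_text)
    if is_text:
        # Regression targets are only meaningful for text instances (assumption).
        loss += w_transform * smooth_l1(pred_transform, gt_transform)
        loss += w_coords * smooth_l1(pred_coords, gt_coords)
    return loss
```

In this sketch, a non-text candidate contributes only the classification term, so its regression outputs are left unconstrained.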

Cite

Text

Wang et al. "Geometry-Aware Scene Text Detection with Instance Transformation Network." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00150

Markdown

[Wang et al. "Geometry-Aware Scene Text Detection with Instance Transformation Network." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/wang2018cvpr-geometryaware/) doi:10.1109/CVPR.2018.00150

BibTeX

@inproceedings{wang2018cvpr-geometryaware,
  title     = {{Geometry-Aware Scene Text Detection with Instance Transformation Network}},
  author    = {Wang, Fangfang and Zhao, Liming and Li, Xi and Wang, Xinchao and Tao, Dacheng},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00150},
  url       = {https://mlanthology.org/cvpr/2018/wang2018cvpr-geometryaware/}
}