C3-STISR: Scene Text Image Super-Resolution with Triple Clues

Zhao, Minyi; Wang, Miao; Bai, Fan; Li, Bingjia; Wang, Jie; Zhou, Shuigeng

doi:10.24963/IJCAI.2022/238

IJCAI 2022 pp. 1707-1713

Abstract

Scene text image super-resolution (STISR) is regarded as an important pre-processing task for recognizing text in low-resolution scene text images. Most recent approaches use the recognizer's feedback as a clue to guide super-resolution. However, using the recognition clue directly has two problems: 1) Compatibility. The clue takes the form of a probability distribution and therefore has an obvious modal gap with STISR, which is a pixel-level task. 2) Inaccuracy. The clue usually contains wrong information, which misleads the main task and degrades super-resolution performance. In this paper, we present a novel method, C3-STISR, that jointly exploits the recognizer's feedback, visual information, and linguistic information as clues to guide super-resolution. The visual clue comes from images of the texts predicted by the recognizer, which are informative and more compatible with the STISR task, while the linguistic clue is generated by a pre-trained character-level language model, which is able to correct the predicted texts. We design effective extraction and fusion mechanisms for the triple cross-modal clues to generate comprehensive and unified guidance for super-resolution. Extensive experiments on TextZoom show that C3-STISR outperforms the SOTA methods in fidelity and recognition performance. Code is available at https://github.com/zhaominyiz/C3-STISR.
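
The three clues named in the abstract can be pictured with a toy sketch. This is not the authors' implementation (see the repository linked above for that); it only illustrates the data shapes involved. The alphabet, the lexicon-based `toy_language_model`, and the pseudo-glyph `toy_render` are all hypothetical stand-ins, and the fusion step is omitted because the abstract does not describe it.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed toy alphabet


def one_hot(text):
    """Encode text as a (len(text), len(ALPHABET)) probability matrix."""
    out = np.zeros((len(text), len(ALPHABET)))
    out[np.arange(len(text)), [ALPHABET.index(c) for c in text]] = 1.0
    return out


def greedy_decode(probs):
    """Turn a per-position character distribution into text via argmax."""
    return "".join(ALPHABET[i] for i in probs.argmax(axis=1))


def toy_language_model(text, lexicon):
    """Hypothetical stand-in for a character-level language model:
    snap the text to the same-length lexicon word with the fewest
    differing characters."""
    same_len = [w for w in lexicon if len(w) == len(text)]
    if not same_len:
        return text
    return min(same_len, key=lambda w: sum(a != b for a, b in zip(w, text)))


def toy_render(text, height=8, char_width=3):
    """Hypothetical stand-in for a text renderer: each character maps to a
    fixed pseudo-glyph, so the result is pixel-shaped, not probability-shaped."""
    glyphs = []
    for c in text:
        rng = np.random.default_rng(ALPHABET.index(c))  # one fixed glyph per char
        glyphs.append(rng.integers(0, 2, size=(height, char_width)).astype(float))
    return np.concatenate(glyphs, axis=1)


# Recognition clue: a noisy distribution whose argmax spells "tent"
# although the (assumed) true word is "text".
recognition_clue = 0.7 * one_hot("tent") + 0.3 / len(ALPHABET)
recognized = greedy_decode(recognition_clue)

# Linguistic clue: the corrected text, again in probability form.
corrected = toy_language_model(recognized, lexicon=["text", "sign", "shop"])
linguistic_clue = one_hot(corrected)

# Visual clue: an image of the predicted text, living in pixel space.
visual_clue = toy_render(recognized)

print(recognized, corrected)
print(recognition_clue.shape, linguistic_clue.shape, visual_clue.shape)
```

In this sketch the recognition and linguistic clues are character-by-alphabet matrices, while the visual clue is a height-by-width image. That shape mismatch is the kind of modal gap the abstract refers to, and the corrected word shows how a language prior can repair a wrong recognizer prediction.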

Cite

Text

Zhao et al. "C3-STISR: Scene Text Image Super-Resolution with Triple Clues." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/238

Markdown

[Zhao et al. "C3-STISR: Scene Text Image Super-Resolution with Triple Clues." International Joint Conference on Artificial Intelligence, 2022.](https://mlanthology.org/ijcai/2022/zhao2022ijcai-c/) doi:10.24963/IJCAI.2022/238

BibTeX

@inproceedings{zhao2022ijcai-c,
  title     = {{C3-STISR: Scene Text Image Super-Resolution with Triple Clues}},
  author    = {Zhao, Minyi and Wang, Miao and Bai, Fan and Li, Bingjia and Wang, Jie and Zhou, Shuigeng},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {1707--1713},
  doi       = {10.24963/IJCAI.2022/238},
  url       = {https://mlanthology.org/ijcai/2022/zhao2022ijcai-c/}
}