Towards Weakly-Supervised Text Spotting Using a Multi-Task Transformer

Abstract

End-to-end text spotting methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually keep the detection and recognition branches distinctly separate, requiring exact annotations for both tasks. We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting and the first text spotting framework that can be trained in both fully- and weakly-supervised settings. By learning a single latent representation per word detection, and using a novel loss function based on the Hungarian loss, our method alleviates the need for expensive localization annotations. Trained with only text transcription annotations on real data, our weakly-supervised method achieves performance competitive with previous state-of-the-art fully-supervised methods. When trained in a fully-supervised manner, TextTranSpotter shows state-of-the-art results on multiple benchmarks.
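The abstract does not spell out the loss, so the following is only a rough illustration of the general idea behind a Hungarian-style matching driven by text alone: each ground-truth transcription is assigned to one predicted word so that the total mismatch is minimal, with no box coordinates involved. Everything here is an assumption made for illustration and not the paper's formulation: the function names `edit_distance` and `match_by_text`, the use of edit distance as the matching cost, and the exhaustive search, which stands in for the Hungarian algorithm (both return an optimal one-to-one assignment, but the exhaustive version is only practical for a handful of words).

```python
from itertools import permutations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def match_by_text(predictions, targets):
    """Assign each target transcription to a distinct predicted word.

    Hypothetical helper: tries every one-to-one assignment and keeps the one
    with the lowest total edit distance. The Hungarian algorithm finds the
    same optimum in polynomial time; brute force is used here only to keep
    the sketch dependency-free.

    Returns (pairs, total_cost), where pairs is a list of
    (prediction_index, target_index) tuples.
    """
    if len(targets) > len(predictions):
        raise ValueError("need at least as many predictions as targets")

    best_cost, best_perm = None, None
    for perm in permutations(range(len(predictions)), len(targets)):
        cost = sum(
            edit_distance(predictions[p], targets[t])
            for t, p in enumerate(perm)
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm

    pairs = [(p, t) for t, p in enumerate(best_perm)]
    return pairs, best_cost


if __name__ == "__main__":
    predicted_words = ["exit", "cofee", "open", "shop"]  # made-up model outputs
    annotated_words = ["coffee", "shop"]                  # transcription-only labels
    print(match_by_text(predicted_words, annotated_words))
```

In a training loop, the matched pairs would decide which prediction receives which transcription as its recognition target, while unmatched predictions would be treated as background. That last step is again an assumption about how such a matching is typically used, not a description of TextTranSpotter itself.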

Cite

Text

Kittenplon et al. "Towards Weakly-Supervised Text Spotting Using a Multi-Task Transformer." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00456

Markdown

[Kittenplon et al. "Towards Weakly-Supervised Text Spotting Using a Multi-Task Transformer." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/kittenplon2022cvpr-weaklysupervised/) doi:10.1109/CVPR52688.2022.00456

BibTeX

@inproceedings{kittenplon2022cvpr-weaklysupervised,
  title     = {{Towards Weakly-Supervised Text Spotting Using a Multi-Task Transformer}},
  author    = {Kittenplon, Yair and Lavi, Inbal and Fogel, Sharon and Bar, Yarin and Manmatha, R. and Perona, Pietro},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {4604--4613},
  doi       = {10.1109/CVPR52688.2022.00456},
  url       = {https://mlanthology.org/cvpr/2022/kittenplon2022cvpr-weaklysupervised/}
}