Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

CVPR 2016

doi:10.1109/CVPR.2016.245

Abstract

We present recursive recurrent neural networks with attention modeling (R2AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction; (2) an implicitly learned character-level language model, embodied in a recurrent neural network, which avoids the need to use N-grams; and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.
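
As a rough illustration of the soft-attention idea in point (3), the sketch below computes one attention step in plain Python. It is not the authors' implementation: the function names `softmax` and `soft_attention`, the toy feature vectors, and the dot-product scoring are all assumptions made for brevity, and the paper's actual scoring function may differ. The point it shows is that the attention weights are a smooth function of the features and the decoder state, which is what makes training by standard backpropagation possible.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, state):
    """Return (weights, context) for one decoding step.

    features: one feature vector per image location (list of lists of floats)
    state:    current decoder hidden state, same length as each feature vector
    Scoring is a plain dot product (an assumption, chosen for simplicity).
    """
    # Relevance score of each image location to the current decoder state.
    scores = [sum(f_j * s_j for f_j, s_j in zip(f, state)) for f in features]
    # Normalise the scores into weights that sum to 1.
    weights = softmax(scores)
    # The context vector is the weighted average of the feature vectors.
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


# Hypothetical toy input: three image locations with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = soft_attention(features, state)
```

In this toy input the first and third locations align equally well with the state, so they receive equal weight and the second location receives less. In a full recognizer the context vector would feed the recurrent network at each character step.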

Cite

Text

Lee and Osindero. "Recursive Recurrent Nets with Attention Modeling for OCR in the Wild." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.245

Markdown

[Lee and Osindero. "Recursive Recurrent Nets with Attention Modeling for OCR in the Wild." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/lee2016cvpr-recursive/) doi:10.1109/CVPR.2016.245

BibTeX

@inproceedings{lee2016cvpr-recursive,
  title     = {{Recursive Recurrent Nets with Attention Modeling for OCR in the Wild}},
  author    = {Lee, Chen-Yu and Osindero, Simon},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.245},
  url       = {https://mlanthology.org/cvpr/2016/lee2016cvpr-recursive/}
}