Learning to Detect Scene Text Using a Higher-Order MRF with Belief Propagation

Abstract

Detecting text in natural 3D scenes is a challenging problem due to background clutter and photometric/geometric variations of scene text. Most prior systems adopt approaches based on deterministic rules and lack a systematic and scalable framework. In this paper, we present a parts-based approach for 3D scene text detection using a higher-order MRF model. The higher-order structure is used to capture the spatial-feature relations among multiple parts in scene text. The use of a higher-order structure and a feature-dependent potential function represents a significant departure from the conventional pairwise MRF, which has been successfully applied in several low-level applications. We further develop a variational approximation method, in the form of belief propagation, for inference in the higher-order model. Our experiments using the ICDAR'03 benchmark showed promising results in detecting scene text with significant geometric variations and background clutter, on planar surfaces or on non-planar surfaces with limited angles.
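
To make the contrast with a pairwise MRF concrete, the following is a minimal sketch of sum-product message passing on a toy model with a single third-order potential. It is not the paper's model or its variational derivation: the three "parts", the unary values, the `triple` potential, and the function names are all hypothetical, chosen only for illustration. Because this toy factor graph is a tree, the messages give exact marginals, which the brute-force enumeration confirms; on graphs with loops, belief propagation would only be approximate.

```python
import itertools

LABELS = (0, 1)  # 0 = non-text, 1 = text

# Hypothetical unary potentials for three candidate parts (made-up numbers).
unary = [
    [0.6, 0.4],
    [0.3, 0.7],
    [0.5, 0.5],
]


def triple(x0, x1, x2):
    """Hypothetical third-order potential: rewards all three parts being text."""
    return 4.0 if (x0, x1, x2) == (1, 1, 1) else 1.0


def bp_marginals(unary, factor):
    """Sum-product on a factor graph with one higher-order factor."""
    n = len(unary)
    marginals = []
    for i in range(n):
        # Factor-to-variable message m(x_i): sum over the other variables of
        # the factor value times their incoming (unary) messages.
        msg = [0.0, 0.0]
        for x in itertools.product(LABELS, repeat=n):
            weight = factor(*x)
            for j in range(n):
                if j != i:
                    weight *= unary[j][x[j]]
            msg[x[i]] += weight
        # Belief = local evidence times the incoming factor message, normalized.
        belief = [unary[i][v] * msg[v] for v in LABELS]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals


def brute_force_marginals(unary, factor):
    """Exact marginals by enumerating every joint labeling (for checking)."""
    n = len(unary)
    totals = [[0.0, 0.0] for _ in range(n)]
    for x in itertools.product(LABELS, repeat=n):
        p = factor(*x)
        for i in range(n):
            p *= unary[i][x[i]]
        for i in range(n):
            totals[i][x[i]] += p
    z = sum(totals[0])
    return [[t / z for t in row] for row in totals]


if __name__ == "__main__":
    for i, m in enumerate(bp_marginals(unary, triple)):
        print(f"part {i}: P(text) = {m[1]:.4f}")
```

In this sketch the third-order potential raises each part's probability of being text above what its unary potential alone would give, which is the kind of multi-part dependency a pairwise potential cannot express directly.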

Cite

Text

Zhang and Chang. "Learning to Detect Scene Text Using a Higher-Order MRF with Belief Propagation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2004. doi:10.1109/CVPR.2004.387

Markdown

[Zhang and Chang. "Learning to Detect Scene Text Using a Higher-Order MRF with Belief Propagation." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2004.](https://mlanthology.org/cvpr/2004/zhang2004cvpr-learning/) doi:10.1109/CVPR.2004.387

BibTeX

@inproceedings{zhang2004cvpr-learning,
  title     = {{Learning to Detect Scene Text Using a Higher-Order MRF with Belief Propagation}},
  author    = {Zhang, Dong-Qing and Chang, Shih-Fu},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2004},
  pages     = {101},
  doi       = {10.1109/CVPR.2004.387},
  url       = {https://mlanthology.org/cvpr/2004/zhang2004cvpr-learning/}
}