DocUNet: Document Image Unwarping via a Stacked U-Net

Ma, Ke; Shu, Zhixin; Bai, Xue; Wang, Jue; Samaras, Dimitris

doi:10.1109/CVPR.2018.00494

CVPR 2018

Abstract

Capturing document images is a common way to digitize and record physical documents, owing to the ubiquity of mobile cameras. To make text recognition easier, it is often desirable to digitally flatten a document image when the physical document sheet is folded or curved. In this paper, we develop the first learning-based method to achieve this goal. We propose a stacked U-Net with intermediate supervision to directly predict the forward mapping from a distorted image to its rectified version. Because large-scale real-world data with ground-truth deformation is difficult to obtain, we create a synthetic dataset of approximately 100,000 images by warping non-distorted document images. The network is trained on this dataset with various data augmentations to improve its generalization ability. We further create a comprehensive benchmark that covers various real-world conditions. We evaluate the proposed model quantitatively and qualitatively on the proposed benchmark, and compare it with previous non-learning-based methods.
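To make the idea of a forward mapping concrete, here is a minimal sketch of how such a map could be used once it has been predicted. It is not the authors' implementation. It assumes the map is stored as an `(H, W, 2)` array holding, for each pixel of the distorted image, the `(row, col)` position of that pixel in the rectified image, and it uses a simple nearest-neighbor scatter. The function name `apply_forward_map` and the `filled` mask are hypothetical, introduced only for illustration.

```python
import numpy as np


def apply_forward_map(distorted, fwd_map, out_shape):
    """Scatter each distorted pixel to its mapped location (nearest neighbor).

    distorted: (H, W) or (H, W, C) image array.
    fwd_map:   (H, W, 2) array; fwd_map[y, x] = (row, col) in the rectified image.
               This layout is an assumption made for this sketch.
    out_shape: (H_out, W_out) of the rectified image.
    Returns the rectified image and a boolean mask of pixels that received a value.
    """
    h_out, w_out = out_shape
    rectified = np.zeros((h_out, w_out) + distorted.shape[2:], dtype=distorted.dtype)
    filled = np.zeros((h_out, w_out), dtype=bool)

    # Round target coordinates to the nearest integer pixel.
    rows = np.rint(fwd_map[..., 0]).astype(int)
    cols = np.rint(fwd_map[..., 1]).astype(int)

    # Drop pixels whose target falls outside the rectified image.
    valid = (rows >= 0) & (rows < h_out) & (cols >= 0) & (cols < w_out)

    rectified[rows[valid], cols[valid]] = distorted[valid]
    filled[rows[valid], cols[valid]] = True
    return rectified, filled


# Toy check: an identity map leaves the image unchanged.
image = np.arange(12, dtype=np.uint8).reshape(3, 4)
ys, xs = np.meshgrid(np.arange(3), np.arange(4), indexing="ij")
identity = np.stack([ys, xs], axis=-1).astype(float)
rectified, filled = apply_forward_map(image, identity, image.shape)

# A map that moves every pixel one column to the right leaves column 0 empty.
shift_right = identity + np.array([0.0, 1.0])
shifted, shifted_filled = apply_forward_map(image, shift_right, image.shape)
```

Because a forward map scatters pixels rather than gathering them, some rectified pixels may receive no value; the `filled` mask marks them so they can be interpolated in a later step. How the holes are actually filled is left open here.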

Cite

Text

Ma et al. "DocUNet: Document Image Unwarping via a Stacked U-Net." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00494

Markdown

[Ma et al. "DocUNet: Document Image Unwarping via a Stacked U-Net." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/ma2018cvpr-docunet/) doi:10.1109/CVPR.2018.00494

BibTeX

@inproceedings{ma2018cvpr-docunet,
  title     = {{DocUNet: Document Image Unwarping via a Stacked U-Net}},
  author    = {Ma, Ke and Shu, Zhixin and Bai, Xue and Wang, Jue and Samaras, Dimitris},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00494},
  url       = {https://mlanthology.org/cvpr/2018/ma2018cvpr-docunet/}
}