Information Extraction from Text Regions with Complex Tabular Structure

Abstract

Recent innovations have advanced layout analysis of document images, significantly improving our ability to identify text and non-text regions. However, extracting information from within a text region remains challenging because the region may itself have a complex structure. In this paper, we present a new dataset with complex text structure and propose new methods to robustly retrieve information from such regions.

Cite

Text

Zhang et al. "Information Extraction from Text Regions with Complex Tabular Structure." NeurIPS 2019 Workshops: Document_Intelligence, 2019.

Markdown

[Zhang et al. "Information Extraction from Text Regions with Complex Tabular Structure." NeurIPS 2019 Workshops: Document_Intelligence, 2019.](https://mlanthology.org/neuripsw/2019/zhang2019neuripsw-information/)

BibTeX

@inproceedings{zhang2019neuripsw-information,
  title     = {{Information Extraction from Text Regions with Complex Tabular Structure}},
  author    = {Zhang, Kaixuan and Shen, Zejiang and Zhou, Jie and Dell, Melissa},
  booktitle = {NeurIPS 2019 Workshops: Document_Intelligence},
  year      = {2019},
  url       = {https://mlanthology.org/neuripsw/2019/zhang2019neuripsw-information/}
}