ChartReader: A Unified Framework for Chart Derendering and Comprehension Without Heuristic Rules

Abstract

Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods rely either on heuristic rules or too heavily on OCR systems, resulting in suboptimal performance. To address these issues, we present ChartReader, a unified framework that integrates chart derendering and comprehension tasks. Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks. By learning the rules of charts automatically from annotated datasets, our approach eliminates the need for manual rule-making, reducing effort and enhancing accuracy. We also introduce a data variable replacement technique and extend the input and position embeddings of the pre-trained model for cross-task training. We evaluate ChartReader on Chart-to-Table, ChartQA, and Chart-to-Text tasks, where it outperforms existing methods. Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model. Moreover, our approach offers opportunities for plug-and-play integration with mainstream LLMs such as T5 and TaPas, extending their capability to chart comprehension tasks.

Cite

Text

Cheng et al. "ChartReader: A Unified Framework for Chart Derendering and Comprehension Without Heuristic Rules." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.02029

Markdown

[Cheng et al. "ChartReader: A Unified Framework for Chart Derendering and Comprehension Without Heuristic Rules." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/cheng2023iccv-chartreader/) doi:10.1109/ICCV51070.2023.02029

BibTeX

@inproceedings{cheng2023iccv-chartreader,
  title     = {{ChartReader: A Unified Framework for Chart Derendering and Comprehension Without Heuristic Rules}},
  author    = {Cheng, Zhi-Qi and Dai, Qi and Hauptmann, Alexander G.},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {22202--22213},
  doi       = {10.1109/ICCV51070.2023.02029},
  url       = {https://mlanthology.org/iccv/2023/cheng2023iccv-chartreader/}
}