Figure Captioning with Relation Maps for Reasoning

Chen, Charles; Zhang, Ruiyi; Koh, Eunyee; Kim, Sungchul; Cohen, Scott; Rossi, Ryan

WACV 2020

Abstract

Figures, such as line plots, pie charts, and bar charts, are widely used to convey important information in a concise format. In this work, we investigate the problem of figure caption generation, where the goal is to automatically generate a natural language description for a given figure. While natural image captioning has been studied extensively, figure captioning has received relatively little attention and remains a challenging problem. A successful solution to this task has many potential applications, such as: 1) adding captions to the output of a visualization tool; 2) summarizing documents that contain a number of figures, with or without proper captions; 3) improving user experience by making figure content accessible to people with visual impairments. To solve this problem, we collect a dataset, FigCAP, for testing the capability of generating captions, and propose a captioning framework with novel attention models. To address the exposure bias issue, we further train the captioning model with a sequence-level policy based on reinforcement learning, which directly optimizes evaluation metrics. Extensive experiments show that our proposed models outperform strong image captioning baselines, demonstrating significant potential for automatically generating captions for figures.
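The sequence-level training mentioned in the abstract can be illustrated with a minimal REINFORCE-style sketch. This is not the authors' implementation: the toy reward `unigram_f1`, the function `policy_gradient_loss`, the greedy-caption baseline, and all token probabilities below are assumptions chosen for illustration only. The idea shown is that a whole sampled caption is scored by a metric, and the log-probability of that caption is pushed up or down according to how the score compares to a baseline.

```python
import math


def unigram_f1(candidate, reference):
    """Toy sequence-level reward: unigram F1 between two token lists.

    A hypothetical stand-in for a real captioning metric.
    """
    if not candidate or not reference:
        return 0.0
    ref_counts = {}
    for tok in reference:
        ref_counts[tok] = ref_counts.get(tok, 0) + 1
    overlap = 0
    for tok in candidate:
        if ref_counts.get(tok, 0) > 0:
            overlap += 1
            ref_counts[tok] -= 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def policy_gradient_loss(token_log_probs, reward, baseline):
    """REINFORCE-style surrogate loss for one sampled caption.

    Minimizing it raises the log-probability of captions whose
    reward exceeds the baseline and lowers it otherwise.
    """
    advantage = reward - baseline
    return -advantage * sum(token_log_probs)


# Made-up example data (assumption, not taken from FigCAP).
reference = "the blue line rises steadily".split()
sampled = "the blue line rises".split()   # caption sampled from the model
greedy = "the line falls".split()         # greedy caption used as baseline

# Hypothetical per-token probabilities the model assigned to `sampled`.
sampled_log_probs = [math.log(p) for p in (0.5, 0.4, 0.6, 0.3)]

reward = unigram_f1(sampled, reference)
baseline = unigram_f1(greedy, reference)
loss = policy_gradient_loss(sampled_log_probs, reward, baseline)
```

Because the reward is computed on the model's own sampled output rather than on ground-truth prefixes, training of this kind sees the same conditions the model faces at test time, which is the usual argument for why it mitigates exposure bias. Using the greedy caption's score as the baseline is one common variance-reduction choice; the abstract does not say which baseline, if any, the paper uses.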

Cite

Text

Chen et al. "Figure Captioning with Relation Maps for Reasoning." Winter Conference on Applications of Computer Vision, 2020.

Markdown

[Chen et al. "Figure Captioning with Relation Maps for Reasoning." Winter Conference on Applications of Computer Vision, 2020.](https://mlanthology.org/wacv/2020/chen2020wacv-figure/)

BibTeX

@inproceedings{chen2020wacv-figure,
  title     = {{Figure Captioning with Relation Maps for Reasoning}},
  author    = {Chen, Charles and Zhang, Ruiyi and Koh, Eunyee and Kim, Sungchul and Cohen, Scott and Rossi, Ryan},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2020},
  url       = {https://mlanthology.org/wacv/2020/chen2020wacv-figure/}
}