ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation
Abstract
We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs). ChartMimic takes information-intensive visual charts and textual instructions as inputs and requires LMMs to generate the corresponding code for chart rendering. ChartMimic includes $4,800$ human-curated (figure, instruction, code) triplets, which represent authentic chart use cases found in scientific papers across various domains (e.g., Physics, Computer Science, and Economics). These charts span $18$ regular types and $4$ advanced types, which further diversify into $201$ subcategories. Furthermore, we propose multi-level evaluation metrics to provide an automatic and thorough assessment of the output code and the rendered charts. Unlike existing code generation benchmarks, ChartMimic emphasizes evaluating LMMs' capacity to integrate multiple cognitive capabilities, encompassing visual understanding, code generation, and cross-modal reasoning. The evaluation of $3$ proprietary models and $14$ open-weight models highlights the substantial challenges posed by ChartMimic. Even the advanced GPT-4o and InternVL2-Llama3-76B achieve average scores of only $82.2$ and $61.6$, respectively, across the Direct Mimic and Customized Mimic tasks, indicating significant room for improvement. We anticipate that ChartMimic will inspire the development of LMMs, advancing the pursuit of artificial general intelligence.
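To make the task concrete, the sketch below shows the kind of chart-rendering code an LMM might be asked to produce from an input figure and a textual instruction. It is a minimal illustrative example only; the data values, labels, and file name are hypothetical placeholders and are not drawn from the benchmark itself.

```python
# Illustrative sketch of chart-rendering code of the sort ChartMimic asks
# an LMM to generate from a chart image plus a textual instruction.
# All data, labels, and the output file name below are hypothetical.
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]
values = [3.2, 5.1, 4.4, 6.0]

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(categories, values, color="#4C72B0")
ax.set_xlabel("Category")
ax.set_ylabel("Value")
ax.set_title("Example bar chart")
fig.tight_layout()
fig.savefig("example_chart.png")  # the rendered output is compared against the input figure
```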
Cite
Text
Yang et al. "ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation." International Conference on Learning Representations, 2025.
Markdown
[Yang et al. "ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/yang2025iclr-chartmimic/)
BibTeX
@inproceedings{yang2025iclr-chartmimic,
  title = {{ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation}},
  author = {Yang, Cheng and Shi, Chufan and Liu, Yaxin and Shui, Bo and Wang, Junjie and Jing, Mohan and Xu, Linran and Zhu, Xinyu and Li, Siheng and Zhang, Yuxiang and Liu, Gongye and Nie, Xiaomei and Cai, Deng and Yang, Yujiu},
  booktitle = {International Conference on Learning Representations},
  year = {2025},
  url = {https://mlanthology.org/iclr/2025/yang2025iclr-chartmimic/}
}