Diagram Understanding in Geometry Questions
Abstract
Automatically solving geometry questions is a long-standing AI problem. A geometry question typically includes a textual description accompanied by a diagram. The first step in solving geometry questions is diagram understanding, which consists of identifying visual elements in the diagram, their locations, their geometric properties, and aligning them to corresponding textual descriptions. In this paper, we present a method for diagram understanding that identifies visual elements in a diagram while maximizing agreement between textual and visual data. We show that the method's objective function is submodular; thus we are able to introduce an efficient method for diagram understanding that is close to optimal. To empirically evaluate our method, we compile a new dataset of geometry questions (textual descriptions and diagrams) and compare with baselines that utilize standard vision techniques. Our experimental evaluation shows an F1 boost of more than 17% in identifying visual elements and 25% in aligning visual elements with their textual descriptions.
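The abstract does not say which optimization procedure the authors use, only that the objective is submodular and that this makes a near-optimal efficient method possible. As an illustration of why submodularity helps, the sketch below shows the standard greedy selection for a monotone submodular objective under a cardinality budget, a setting in which greedy is known to reach at least a (1 - 1/e) fraction of the optimum. The function names, the budget constraint, and the toy coverage objective (candidate visual elements "explaining" diagram pixels) are all assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, FrozenSet, Iterable, Set


def greedy_maximize(
    candidates: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
    budget: int,
) -> Set[str]:
    """Greedily pick up to `budget` candidates by largest marginal gain."""
    selected: Set[str] = set()
    remaining = set(candidates)
    current = objective(frozenset(selected))
    while remaining and len(selected) < budget:
        best, best_gain = None, 0.0
        for c in sorted(remaining):  # sorted so ties break deterministically
            gain = objective(frozenset(selected | {c})) - current
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no candidate improves the objective
            break
        selected.add(best)
        remaining.remove(best)
        current += best_gain
    return selected


# Hypothetical toy data: each candidate element "explains" some pixel ids.
COVERAGE = {
    "line_AB": {1, 2, 3},
    "line_BC": {3, 4},
    "circle_O": {5, 6, 7, 8},
    "line_AC": {1, 2},
}


def coverage(selected: FrozenSet[str]) -> float:
    """Number of distinct pixels explained (a monotone submodular function)."""
    covered: Set[int] = set()
    for name in selected:
        covered |= COVERAGE[name]
    return float(len(covered))


chosen = greedy_maximize(COVERAGE, coverage, budget=2)
```

With a budget of 2, the sketch selects the two candidates that together explain the most pixels. The redundant candidate `line_AC` is never chosen, even with a larger budget, because its marginal gain drops to zero once `line_AB` is selected.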
Cite
Text
Seo et al. "Diagram Understanding in Geometry Questions." AAAI Conference on Artificial Intelligence, 2014. doi:10.1609/AAAI.V28I1.9146
Markdown
[Seo et al. "Diagram Understanding in Geometry Questions." AAAI Conference on Artificial Intelligence, 2014.](https://mlanthology.org/aaai/2014/seo2014aaai-diagram/) doi:10.1609/AAAI.V28I1.9146
BibTeX
@inproceedings{seo2014aaai-diagram,
title = {{Diagram Understanding in Geometry Questions}},
author = {Seo, Min Joon and Hajishirzi, Hannaneh and Farhadi, Ali and Etzioni, Oren},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2014},
pages = {2831-2838},
doi = {10.1609/AAAI.V28I1.9146},
url = {https://mlanthology.org/aaai/2014/seo2014aaai-diagram/}
}