Interpretable Visual Reasoning via Induced Symbolic Space

Abstract

We study the problem of concept induction in visual reasoning, i.e., identifying concepts and their hierarchical relationships from question-answer pairs associated with images, and achieve an interpretable model by working on the induced symbolic concept space. To this end, we first design a new framework named object-centric compositional attention model (OCCAM) to perform the visual reasoning task with object-level visual features. Then, we propose a method to induce concepts of objects and relations using clues from the attention patterns between objects' visual features and question words. Finally, we achieve a higher level of interpretability by applying OCCAM to the objects represented in the induced symbolic concept space. Experiments on the CLEVR and GQA datasets demonstrate that: 1) OCCAM achieves a new state of the art without human-annotated functional programs; 2) the induced concepts are both accurate and sufficient, as OCCAM achieves on-par performance on objects represented either in visual features or in the induced symbolic concept space.

Cite

Text

Wang et al. "Interpretable Visual Reasoning via Induced Symbolic Space." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00189

Markdown

[Wang et al. "Interpretable Visual Reasoning via Induced Symbolic Space." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/wang2021iccv-interpretable/) doi:10.1109/ICCV48922.2021.00189

BibTeX

@inproceedings{wang2021iccv-interpretable,
  title     = {{Interpretable Visual Reasoning via Induced Symbolic Space}},
  author    = {Wang, Zhonghao and Wang, Kai and Yu, Mo and Xiong, Jinjun and Hwu, Wen-mei and Hasegawa-Johnson, Mark and Shi, Humphrey},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {1878--1887},
  doi       = {10.1109/ICCV48922.2021.00189},
  url       = {https://mlanthology.org/iccv/2021/wang2021iccv-interpretable/}
}