Knowledge Informed Sequential Scene Graph Verification Using VQA

ICCVW 2023, pp. 21-31

doi:10.1109/ICCVW60793.2023.00009

Abstract

We propose a new task, non-localized scene graph verification, whose objective is to provide a justified expression of the inconsistencies between the visual content of an image and its non-localized scene graph, in order to diagnose errors or anticipate corrections. We introduce a sequential algorithm capable of detecting these inconsistencies and proposing plausible corrections, taking into account the information already present in the scene graph and exploiting knowledge priors. Instead of relying on object detection that requires bounding box annotations, we use simple visual question answering (VQA) as a proxy for visual content analysis. We show on the VG150 dataset that our strategy is efficient compared to a baseline adapted from a caption editing approach. We also show that our algorithm is able to efficiently correct corrupted scene graphs.

Cite

Text

Thauvin and Herbin. "Knowledge Informed Sequential Scene Graph Verification Using VQA." IEEE/CVF International Conference on Computer Vision Workshops, 2023. doi:10.1109/ICCVW60793.2023.00009

Markdown

[Thauvin and Herbin. "Knowledge Informed Sequential Scene Graph Verification Using VQA." IEEE/CVF International Conference on Computer Vision Workshops, 2023.](https://mlanthology.org/iccvw/2023/thauvin2023iccvw-knowledge/) doi:10.1109/ICCVW60793.2023.00009

BibTeX

@inproceedings{thauvin2023iccvw-knowledge,
  title     = {{Knowledge Informed Sequential Scene Graph Verification Using VQA}},
  author    = {Thauvin, Dao and Herbin, Stéphane},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2023},
  pages     = {21--31},
  doi       = {10.1109/ICCVW60793.2023.00009},
  url       = {https://mlanthology.org/iccvw/2023/thauvin2023iccvw-knowledge/}
}