Knowledge Informed Sequential Scene Graph Verification Using VQA
Abstract
We propose a new task, non-localized scene graph verification, whose objective is to provide a justified expression of inconsistencies between the visual content of the image and its non-localized scene graph in order to diagnose errors or anticipate corrections. We introduce a sequential algorithm capable of detecting inconsistencies and proposing plausible corrections, taking into account the information already present in the scene graph and exploiting knowledge priors. Instead of relying on object detection that requires bounding box annotations, we use a simple visual question answering (VQA) model as a proxy for visual content analysis. We show on the VG150 dataset that our strategy is efficient compared to a baseline adapted from a caption editing approach. We also show that our algorithm is able to efficiently correct corrupted scene graphs.
Cite
Text
Thauvin and Herbin. "Knowledge Informed Sequential Scene Graph Verification Using VQA." IEEE/CVF International Conference on Computer Vision Workshops, 2023. doi:10.1109/ICCVW60793.2023.00009
Markdown
[Thauvin and Herbin. "Knowledge Informed Sequential Scene Graph Verification Using VQA." IEEE/CVF International Conference on Computer Vision Workshops, 2023.](https://mlanthology.org/iccvw/2023/thauvin2023iccvw-knowledge/) doi:10.1109/ICCVW60793.2023.00009
BibTeX
@inproceedings{thauvin2023iccvw-knowledge,
title = {{Knowledge Informed Sequential Scene Graph Verification Using VQA}},
author = {Thauvin, Dao and Herbin, Stéphane},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2023},
pages = {21-31},
doi = {10.1109/ICCVW60793.2023.00009},
url = {https://mlanthology.org/iccvw/2023/thauvin2023iccvw-knowledge/}
}