A Simple Baseline for Weakly-Supervised Scene Graph Generation
Abstract
We investigate weakly-supervised scene graph generation, a challenging task because no correspondence between labels and objects is provided. Previous work treats this correspondence as a latent variable that is iteratively updated via nested optimization of the scene graph generation objective. We instead reduce the complexity by decoupling the problem: an efficient first-order graph matching module, optimized via contrastive learning, first recovers the correspondence, which is then used to train a standard scene graph generation model. Extensive experiments show that this simple pipeline surpasses the previous state of the art by more than 30% on the Visual Genome dataset, in terms of both graph matching accuracy and scene graph quality. We believe this work serves as a strong baseline for future research.
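To make the term "first-order graph matching" concrete, the sketch below illustrates the general idea only: when matching uses node-to-node similarities alone (no edge or pairwise terms), it reduces to a linear assignment between label nodes and candidate image regions. This is not the paper's module. The function name `first_order_match`, the similarity values, and the brute-force search are all assumptions made for illustration, and the contrastive training that would produce the similarities is omitted.

```python
from itertools import permutations


def first_order_match(sim):
    """Assign each label (row) to a distinct region (column) so that the
    summed node similarity is maximal.

    Hypothetical helper: exhaustive search, suitable only for tiny inputs.
    """
    n_labels, n_regions = len(sim), len(sim[0])
    if n_labels > n_regions:
        raise ValueError("need at least as many regions as labels")

    best_score, best_cols = float("-inf"), None
    # Enumerate every one-to-one mapping from labels to regions.
    for cols in permutations(range(n_regions), n_labels):
        score = sum(sim[i][c] for i, c in enumerate(cols))
        if score > best_score:
            best_score, best_cols = score, cols
    return list(best_cols), best_score


# Made-up similarities: 2 label nodes (rows) vs. 3 region proposals (columns).
sim = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
]
assignment, score = first_order_match(sim)
print(assignment)  # [0, 1]: label 0 -> region 0, label 1 -> region 1
```

For realistic problem sizes, a polynomial-time assignment solver such as the Hungarian algorithm would replace the exhaustive search. The recovered assignment plays the role of the label-to-object correspondence that the abstract says is then used to train a standard scene graph generation model.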
Cite
Text
Shi et al. "A Simple Baseline for Weakly-Supervised Scene Graph Generation." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.01608
Markdown
[Shi et al. "A Simple Baseline for Weakly-Supervised Scene Graph Generation." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/shi2021iccv-simple/) doi:10.1109/ICCV48922.2021.01608
BibTeX
@inproceedings{shi2021iccv-simple,
title = {{A Simple Baseline for Weakly-Supervised Scene Graph Generation}},
author = {Shi, Jing and Zhong, Yiwu and Xu, Ning and Li, Yin and Xu, Chenliang},
booktitle = {International Conference on Computer Vision},
year = {2021},
pages = {16393-16402},
doi = {10.1109/ICCV48922.2021.01608},
url = {https://mlanthology.org/iccv/2021/shi2021iccv-simple/}
}