HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation

Abstract

Being able to understand visual scenes is a precursor for many downstream tasks, including autonomous driving, robotics, and other vision-based approaches. A common approach enabling the ability to reason over visual data is Scene Graph Generation (SGG); however, many existing approaches assume undisturbed vision, i.e., the absence of real-world corruptions such as fog, snow, and smoke, as well as non-uniform perturbations like sun glare or water drops. In this work, we propose a novel SGG benchmark containing procedurally generated weather corruptions and other transformations over the Visual Genome dataset. Further, we introduce a corresponding approach, Hierarchical Knowledge Enhanced Robust Scene Graph Generation (HiKER-SGG), providing a strong baseline for scene graph generation under such challenging settings. At its core, HiKER-SGG utilizes a hierarchical knowledge graph in order to refine its predictions from coarse initial estimates to detailed predictions. In our extensive experiments, we show that HiKER-SGG not only demonstrates superior performance on corrupted images in a zero-shot manner, but also outperforms current state-of-the-art methods on uncorrupted SGG tasks. Code is available at https://github.com/zhangce01/HiKER-SGG.

Cite

Text

Zhang et al. "HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02667

Markdown

[Zhang et al. "HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/zhang2024cvpr-hikersgg/) doi:10.1109/CVPR52733.2024.02667

BibTeX

@inproceedings{zhang2024cvpr-hikersgg,
  title     = {{HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation}},
  author    = {Zhang, Ce and Stepputtis, Simon and Campbell, Joseph and Sycara, Katia and Xie, Yaqi},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {28233-28243},
  doi       = {10.1109/CVPR52733.2024.02667},
  url       = {https://mlanthology.org/cvpr/2024/zhang2024cvpr-hikersgg/}
}