OED: Towards One-Stage End-to-End Dynamic Scene Graph Generation

Abstract

Dynamic Scene Graph Generation (DSGG) focuses on identifying visual relationships within the spatial-temporal domain of videos. Conventional approaches often employ multi-stage pipelines, which typically consist of object detection, temporal association, and multi-relation classification. However, these methods exhibit inherent limitations due to the separation of multiple stages, and independent optimization of these sub-problems may yield sub-optimal solutions. To remedy these limitations, we propose a one-stage end-to-end framework, termed OED, which streamlines the DSGG pipeline. This framework reformulates the task as a set prediction problem and leverages pair-wise features to represent each subject-object pair within the scene graph. Moreover, to address another challenge of DSGG, capturing temporal dependencies, we introduce a Progressively Refined Module (PRM) for aggregating temporal context without the constraints of additional trackers or handcrafted trajectories, enabling end-to-end optimization of the network. Extensive experiments conducted on the Action Genome benchmark demonstrate the effectiveness of our design. The code and models are available at https://github.com/guanw-pku/OED.
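To make the set prediction formulation concrete, the following is a minimal sketch of matching a set of predicted subject-object pairs one-to-one against ground-truth triplets. It is an illustration only, not the authors' implementation. The abstract does not specify the matching procedure or cost, so the 0/1 label-mismatch cost, the brute-force search, and the names `pair_cost` and `match_sets` are all assumptions. Set prediction models are often trained with a bipartite matching of this kind, typically solved with the Hungarian algorithm rather than by enumeration.

```python
from itertools import permutations

def pair_cost(pred, gt):
    """Hypothetical matching cost: count of mismatched labels in a triplet."""
    return sum(pred[k] != gt[k] for k in ("subject", "object", "predicate"))

def match_sets(preds, gts):
    """Brute-force one-to-one assignment of ground-truth triplets to predictions.

    Returns (total_cost, assignment), where assignment[i] is the index of the
    prediction matched to gts[i]. Assumes len(preds) >= len(gts); only
    practical for very small sets.
    """
    best_cost, best_assignment = float("inf"), ()
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(pair_cost(preds[p], gts[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_assignment = cost, perm
    return best_cost, best_assignment

# Toy data: three predicted pairs for one frame, two annotated relations.
preds = [
    {"subject": "person", "object": "cup", "predicate": "holding"},
    {"subject": "person", "object": "sofa", "predicate": "sitting_on"},
    {"subject": "person", "object": "book", "predicate": "looking_at"},
]
gts = [
    {"subject": "person", "object": "sofa", "predicate": "sitting_on"},
    {"subject": "person", "object": "cup", "predicate": "touching"},
]

cost, assignment = match_sets(preds, gts)
print(cost, assignment)  # 1 (1, 0)
```

Here the first ground-truth triplet is matched to prediction 1 at zero cost, and the second to prediction 0 with one mismatched label. In a set prediction setup, predictions left unmatched (prediction 2 here) would usually be supervised toward a "no relation" class.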

Cite

Text

Wang et al. "OED: Towards One-Stage End-to-End Dynamic Scene Graph Generation." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02639

Markdown

[Wang et al. "OED: Towards One-Stage End-to-End Dynamic Scene Graph Generation." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/wang2024cvpr-oed/) doi:10.1109/CVPR52733.2024.02639

BibTeX

@inproceedings{wang2024cvpr-oed,
  title     = {{OED: Towards One-Stage End-to-End Dynamic Scene Graph Generation}},
  author    = {Wang, Guan and Li, Zhimin and Chen, Qingchao and Liu, Yang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {27938-27947},
  doi       = {10.1109/CVPR52733.2024.02639},
  url       = {https://mlanthology.org/cvpr/2024/wang2024cvpr-oed/}
}