Inverse Compositional Learning for Weakly-Supervised Relation Grounding

Abstract

Video relation grounding (VRG) is a significant and challenging problem in cross-modal learning and video understanding. In this study, we introduce a novel approach called inverse compositional learning (ICL) for weakly-supervised video relation grounding. Our approach represents relations at both the holistic and partial levels, formulating VRG as a joint optimization problem that encompasses reasoning at both levels. For holistic-level reasoning, we propose an inverse attention mechanism and a compositional encoder to generate compositional relevance features, and we introduce an inverse loss to evaluate and learn the relevance between visual features and relation features. For partial-level reasoning, we introduce a grounding-by-classification scheme. By leveraging the learned holistic-level and partial-level features, we train the entire model in an end-to-end manner. Evaluations on two challenging datasets demonstrate that our method substantially outperforms state-of-the-art methods, and extensive ablation studies confirm the effectiveness of each component.
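The abstract does not give implementation details, but one plausible reading of the inverse-attention idea (summarizing a video both by the frames a standard attention head emphasizes and by the complement of that distribution) can be sketched as follows. The function names and the specific `1 - w` formulation here are illustrative assumptions, not the authors' published method.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inverse_attention(query, keys, values):
    """Hypothetical 'inverse attention' sketch: alongside the usual
    attention-weighted summary, build a complementary summary that
    emphasizes the frames standard attention down-weights, by
    renormalizing (1 - w). Assumed formulation for illustration only."""
    scores = keys @ query / np.sqrt(query.shape[-1])   # (T,) similarity scores
    w = softmax(scores)                                # standard attention weights
    inv_w = (1.0 - w) / (1.0 - w).sum()                # complementary distribution
    attended = w @ values                              # relation-relevant summary
    inverse = inv_w @ values                           # complementary summary
    return attended, inverse

rng = np.random.default_rng(0)
T, d = 8, 16                       # T video frames, d-dim features
query = rng.normal(size=d)         # relation (text) embedding, assumed given
frames = rng.normal(size=(T, d))   # per-frame visual features
att, inv = inverse_attention(query, frames, frames)
```

In a weakly-supervised setting, a loss contrasting the two summaries against the relation embedding (so that `att` matches it and `inv` does not) could serve as the kind of relevance signal the abstract's "inverse loss" describes; the paper itself should be consulted for the actual objective.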

Cite

Text

Li et al. "Inverse Compositional Learning for Weakly-Supervised Relation Grounding." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01419

Markdown

[Li et al. "Inverse Compositional Learning for Weakly-Supervised Relation Grounding." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/li2023iccv-inverse/) doi:10.1109/ICCV51070.2023.01419

BibTeX

@inproceedings{li2023iccv-inverse,
  title     = {{Inverse Compositional Learning for Weakly-Supervised Relation Grounding}},
  author    = {Li, Huan and Wei, Ping and Ma, Zeyu and Zheng, Nanning},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {15477--15487},
  doi       = {10.1109/ICCV51070.2023.01419},
  url       = {https://mlanthology.org/iccv/2023/li2023iccv-inverse/}
}