Two Heads Are Better than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination

Abstract

Even from a still image, humans can ratiocinate various visual cause-and-effect descriptions of events before, at present, and after, as well as beyond the given image. However, this task, visual event ratiocination, is challenging for models owing to the limitations of time and space. To this end, we propose a novel multi-modal model, Hypergraph-Enhanced Graph Reasoning. First, it represents the contents from the same modality as a semantic graph and mines the intra-modality relationship, thereby breaking the limitations in the spatial domain. Then, we introduce the Graph Self-Attention Enhancement. On the one hand, this enables semantic graph representations from different modalities to enhance each other and captures the inter-modality relationship along the way. On the other hand, it utilizes the multi-modal hypergraphs that we build at different moments to boost individual semantic graph representations, breaking the limitations in the temporal domain. Our method illustrates the case of "two heads are better than one" in the sense that semantic graph representations with the help of the proposed enhancement mechanism are more robust than those without. Finally, we re-project these representations and leverage their outcomes to generate textual cause-and-effect descriptions. Experimental results show that our model achieves significantly higher performance than other state-of-the-art methods.
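
The two enhancement ideas named in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: the abstract gives no equations, so the scaled dot-product attention, the mean-pooling hypergraph step, and all names (`cross_modal_enhance`, `hypergraph_smooth`, `incidence`, the feature sizes) are assumptions chosen for illustration. The first function lets node features of one modality attend to those of another; the second averages features over hyperedges that group nodes from both modalities.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_enhance(nodes_a, nodes_b):
    # Assumed form: scaled dot-product attention from graph A's nodes
    # to graph B's nodes, added back to A as a residual.
    d = nodes_a.shape[1]
    scores = nodes_a @ nodes_b.T / np.sqrt(d)   # (n_a, n_b)
    weights = softmax(scores, axis=1)           # each row sums to 1
    return nodes_a + weights @ nodes_b

def hypergraph_smooth(features, incidence):
    # Assumed form: average node features inside each hyperedge, then
    # average the hyperedge features back onto each member node.
    # incidence[i, j] = 1 if node i belongs to hyperedge j.
    edge_deg = incidence.sum(axis=0)            # nodes per hyperedge
    node_deg = incidence.sum(axis=1)            # hyperedges per node
    edge_feats = (incidence.T @ features) / edge_deg[:, None]
    return (incidence @ edge_feats) / node_deg[:, None]

# Hypothetical toy data: 4 visual nodes and 3 textual nodes, 8-dim features.
rng = np.random.default_rng(0)
visual = rng.normal(size=(4, 8))
textual = rng.normal(size=(3, 8))

enhanced = cross_modal_enhance(visual, textual)
all_nodes = np.vstack([enhanced, textual])      # 7 nodes in total

# Two hyperedges mixing both modalities; every node is in at least one.
incidence = np.array([[1, 0], [1, 0], [0, 1], [0, 1],
                      [1, 1], [0, 1], [0, 1]], dtype=float)
smoothed = hypergraph_smooth(all_nodes, incidence)
print(enhanced.shape, smoothed.shape)
```

A hyperedge can join any number of nodes, which is why an incidence matrix rather than an adjacency matrix is used here; the sketch assumes every node and every hyperedge has nonzero degree.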

Cite

Text

Zheng et al. "Two Heads Are Better than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination." International Conference on Machine Learning, 2021.

Markdown

[Zheng et al. "Two Heads Are Better than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/zheng2021icml-two/)

BibTeX

@inproceedings{zheng2021icml-two,
  title     = {{Two Heads Are Better than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination}},
  author    = {Zheng, Wenbo and Yan, Lan and Gou, Chao and Wang, Fei-Yue},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {12747--12760},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/zheng2021icml-two/}
}