Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation

Lianggangxu Chen, Youqi Song, Yiqing Cai, Jiale Lu, Yang Li, Yuan Xie, Changbo Wang, Gaoqi He

AAAI 2024 pp. 1129-1137

doi:10.1609/AAAI.V38I2.27874

Abstract

In scene graph generation, commonsense has typically been modeled as a single-prototype representation to facilitate the recognition of infrequent predicates. However, a fundamental challenge lies in the large intra-class variation in the visual appearance of predicates, which results in subclasses within a predicate class. This variation often causes diverse predicates to be misclassified because the predicate space is clustered too coarsely. In this paper, inspired by cognitive science, we maintain multi-prototype representations for each predicate class, which can accurately find the multiple class centers of the predicate space. Technically, we propose a novel multi-prototype learning framework consisting of three main steps: prototype-predicate matching, prototype updating, and prototype space optimization. We first design a triple-level optimal transport to match each predicate feature within the same class to a specific prototype. Next, the prototypes are updated with momentum updating to find the class centers according to the matching results. Finally, we optimize the prototype space by iterating an inter-class separability loss and an intra-class compactness loss. Extensive evaluations demonstrate that our approach significantly outperforms state-of-the-art methods on the Visual Genome dataset.
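
The matching and updating steps described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it substitutes plain entropic optimal transport (Sinkhorn iterations with uniform marginals) for the paper's triple-level optimal transport, whose details the abstract does not give, and it omits the separability and compactness losses. The function names `sinkhorn_assign` and `momentum_update`, the cosine-similarity affinity, and the default values of `eps`, `n_iters`, and `momentum` are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_assign(features, prototypes, eps=0.05, n_iters=50):
    """Softly match N features of one class to its K prototypes.

    Assumption: a single-level entropic OT with uniform marginals stands in
    for the paper's triple-level optimal transport.
    Returns an (N, K) matrix whose rows sum to 1.
    """
    # Cosine similarity between each feature and each prototype.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sim = f @ p.T

    # Entropic kernel; subtracting the max keeps the exponent non-positive.
    q = np.exp((sim - sim.max()) / eps)
    n, k = q.shape
    for _ in range(n_iters):
        q /= q.sum(axis=0, keepdims=True) * k  # each column now sums to 1/k
        q /= q.sum(axis=1, keepdims=True) * n  # each row now sums to 1/n
    return q * n


def momentum_update(prototypes, features, assignment, momentum=0.9):
    """Move each prototype toward the mean of the features matched to it."""
    hard = assignment.argmax(axis=1)  # index of the best prototype per feature
    updated = prototypes.copy()
    for j in range(prototypes.shape[0]):
        members = features[hard == j]
        if len(members) == 0:
            continue  # no feature matched this prototype; leave it unchanged
        updated[j] = momentum * prototypes[j] + (1.0 - momentum) * members.mean(axis=0)
    return updated
```

In use, one would call `sinkhorn_assign` on the features of a single predicate class and pass the result to `momentum_update`. The uniform column marginal discourages all features from collapsing onto one prototype, which is a common reason for choosing optimal transport over nearest-prototype assignment in this kind of setup.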

Cite

Text

Chen et al. "Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I2.27874

Markdown

[Chen et al. "Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/chen2024aaai-multi/) doi:10.1609/AAAI.V38I2.27874

BibTeX

@inproceedings{chen2024aaai-multi,
  title     = {{Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation}},
  author    = {Chen, Lianggangxu and Song, Youqi and Cai, Yiqing and Lu, Jiale and Li, Yang and Xie, Yuan and Wang, Changbo and He, Gaoqi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {1129-1137},
  doi       = {10.1609/AAAI.V38I2.27874},
  url       = {https://mlanthology.org/aaai/2024/chen2024aaai-multi/}
}