Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval

Abstract

The task of keyword-based diverse image retrieval has received considerable attention due to its wide demand in real-world scenarios. Existing methods either diversify results through a hand-designed multi-stage re-ranking strategy or extend sub-semantics via an implicit generator; the former depends on manual labor, while the latter lacks explainability. To learn more diverse and explainable representations, we capture sub-semantics explicitly by leveraging a multi-modal knowledge graph (MMKG) that contains richer entities and relations. However, the large domain gap between off-the-shelf MMKGs and retrieval datasets, as well as the semantic gap between images and texts, makes fusing the MMKG difficult. In this paper, we pioneer a degree-free hypergraph solution that models many-to-many relations to address the challenges of heterogeneous sources and heterogeneous modalities. Specifically, we propose a hyperlink-based solution, the Multi-Modal Knowledge Hypergraph (MKHG), which bridges heterogeneous data via various hyperlinks to diversify sub-semantics. A hypergraph construction module first customizes various hyperedges to link the heterogeneous MMKG and retrieval databases; a multi-modal instance bagging module then explicitly selects instances to diversify the semantics; meanwhile, a diverse concept aggregator flexibly adapts key sub-semantics. Finally, several losses are adopted to optimize the semantic space. Extensive experiments on two real-world datasets verify the effectiveness and explainability of the proposed method.
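The abstract's central structure is a degree-free hypergraph, where a single hyperedge may link any number of heterogeneous nodes (images, query terms, MMKG entities). The following is a minimal, hypothetical sketch of that data structure, not the paper's implementation; all node and edge names are illustrative.

```python
# Minimal sketch of a degree-free hypergraph linking heterogeneous items
# (images, query terms, knowledge-graph entities) via hyperedges.
# Illustrative only; names and structure are assumptions, not the paper's code.
from collections import defaultdict

class Hypergraph:
    def __init__(self):
        self.edges = {}                      # hyperedge id -> set of member nodes
        self.memberships = defaultdict(set)  # node -> ids of hyperedges containing it

    def add_hyperedge(self, edge_id, nodes):
        # A hyperedge may connect any number of nodes ("degree-free"),
        # so one edge can bridge an image, a query term, and MMKG entities.
        self.edges[edge_id] = set(nodes)
        for n in nodes:
            self.memberships[n].add(edge_id)

    def neighbors(self, node):
        # All nodes reachable from `node` through any shared hyperedge.
        out = set()
        for e in self.memberships[node]:
            out |= self.edges[e]
        out.discard(node)
        return out

hg = Hypergraph()
# One hyperedge per sub-semantic of the ambiguous keyword "apple":
hg.add_hyperedge("apple_fruit", ["img_001", "img_007", "kg:Apple_(fruit)", "query:apple"])
hg.add_hyperedge("apple_brand", ["img_042", "kg:Apple_Inc", "query:apple"])

# Retrieving via the query node surfaces images from both sub-semantics,
# the kind of result diversity the abstract describes.
diverse_images = sorted(n for n in hg.neighbors("query:apple") if n.startswith("img_"))
print(diverse_images)  # → ['img_001', 'img_007', 'img_042']
```

Because the query node belongs to both hyperedges, a single lookup spans multiple sub-semantics; a conventional graph edge (degree two) could not express this many-to-many grouping directly.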

Cite

Text

Zeng et al. "Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I3.25445

Markdown

[Zeng et al. "Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/zeng2023aaai-multi/) doi:10.1609/AAAI.V37I3.25445

BibTeX

@inproceedings{zeng2023aaai-multi,
  title     = {{Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval}},
  author    = {Zeng, Yawen and Jin, Qin and Bao, Tengfei and Li, Wenfeng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {3376--3383},
  doi       = {10.1609/AAAI.V37I3.25445},
  url       = {https://mlanthology.org/aaai/2023/zeng2023aaai-multi/}
}