DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation

Meng, Fanshen; Meng, Zhenhua; Jin, Ru; Lin, Rongheng; Wu, Budan

doi:10.1609/AAAI.V39I12.33351

AAAI 2025 pp. 12399-12407

Abstract

Multimodal recommender systems have attracted growing interest in recent years. These systems aim to understand user preferences by leveraging both user interaction data and the multimodal information associated with items, and they frequently achieve higher recommendation accuracy than traditional models that rely solely on user-item interactions. Despite these advances, existing methods make relatively little use of image features when propagating item-item characteristics, rely too heavily on text feature similarity, and frequently neglect the deep relationships between items, users, and modalities. In response to these challenges, we introduce a novel model termed LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation (DOGE). DOGE uses large language models (LLMs) to understand image information under the guidance of text information, generating cross-modal features that strengthen the relationship between the text and image modalities. DOGE then constructs a Hyper-Knowledge Graph (HKG) from user-item interaction information and the LLM-enhanced modality features. This graph encompasses a wide range of item-item and user-user binary relations and hyper-relations, which expands the feature propagation mechanisms and mitigates the overreliance on the text modality. By learning on the heterogeneous user-item graph and the homogeneous item-item and user-user graphs, DOGE improves propagation between item features and user features and acquires more effective representations of users and items. Comprehensive experiments on three public real-world datasets show that DOGE attains state-of-the-art (SOTA) performance, with a 7.2% improvement over the strongest baseline.
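
The abstract does not say how the graph relations are built, so the following is only a minimal sketch of one common way to obtain the two kinds of relations it mentions. It assumes binary item-item relations come from top-k cosine similarity over per-modality item features, and that a hyper-relation groups all items a single user interacted with. The function names, the choice of cosine similarity, and the per-user hyperedge rule are illustrative assumptions, not the paper's construction.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_item_edges(features, k):
    """Hypothetical helper: directed (item, neighbor) edges to each item's
    k most similar items. `features` maps item id -> feature vector."""
    edges = set()
    for i, fi in features.items():
        scored = [(cosine(fi, fj), j) for j, fj in features.items() if j != i]
        # Highest similarity first; ties broken by item id for determinism.
        scored.sort(key=lambda t: (-t[0], t[1]))
        for _, j in scored[:k]:
            edges.add((i, j))
    return edges


def user_hyperedges(interactions):
    """Hypothetical helper: one hyperedge per user, containing every item
    that user interacted with. Users with fewer than two items are skipped
    because a single-item hyperedge relates nothing."""
    by_user = {}
    for user, item in interactions:
        by_user.setdefault(user, set()).add(item)
    return {u: frozenset(items) for u, items in by_user.items() if len(items) >= 2}
```

Taking the union of `knn_item_edges(image_feats, k)` and `knn_item_edges(text_feats, k)` gives an item-item graph in which image similarity contributes edges that text similarity alone would miss, which is the kind of reduced dependence on the text modality the abstract describes.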

Cite

Text

Meng et al. "DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I12.33351

Markdown

[Meng et al. "DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/meng2025aaai-doge/) doi:10.1609/AAAI.V39I12.33351

BibTeX

@inproceedings{meng2025aaai-doge,
  title     = {{DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation}},
  author    = {Meng, Fanshen and Meng, Zhenhua and Jin, Ru and Lin, Rongheng and Wu, Budan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {12399-12407},
  doi       = {10.1609/AAAI.V39I12.33351},
  url       = {https://mlanthology.org/aaai/2025/meng2025aaai-doge/}
}