DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation
Abstract
Multimodal recommender systems have attracted growing interest in recent years. These systems aim to understand user preferences by leveraging both user interaction data and multimodal information associated with items. This approach often yields higher recommendation accuracy than traditional models that rely solely on user-item interactions. Despite these advances, existing methods make relatively little use of image features when propagating item-item characteristics, rely too heavily on text feature similarity, and frequently neglect the deep relationships among items, users, and modalities. To address these challenges, we introduce the LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation (DOGE). DOGE uses large language models (LLMs) to interpret image information under the guidance of text information, generating cross-modal features that strengthen the relationship between the text and image modalities. DOGE then constructs a Hyper-Knowledge Graph (HKG) from user-item interaction information and the LLM-enhanced modality features. This graph encompasses a wide range of item-item and user-user binary relations and hyper-relations, which expands the feature propagation mechanisms and mitigates the overreliance on the text modality. By learning on heterogeneous user-item graphs and on homogeneous item-item and user-user graphs, DOGE improves propagation between item features and user features and obtains more effective representations of users and items. Comprehensive experiments on three public real-world datasets show that DOGE attains state-of-the-art (SOTA) performance, with a 7.2% improvement over the strongest baseline.
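The abstract does not spell out how the item-item relations are built, so the following is only a minimal sketch of one common way to derive item-item edges from modality features: a k-nearest-neighbour graph over a weighted mix of text and image cosine similarities. The function names, the `image_weight` parameter, and the toy feature vectors are all hypothetical and are not taken from DOGE; the sketch is meant only to illustrate why relying on text similarity alone can select different neighbours than a mix that also uses image features.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def knn_item_graph(text_feats, image_feats, k=1, image_weight=0.5):
    """Return {item index: [k most similar item indices]}.

    Similarity is a weighted mix of per-modality cosine similarities.
    The mixing rule is an illustrative assumption, not the DOGE construction.
    """
    n = len(text_feats)
    graph = {}
    for i in range(n):
        scored = []
        for j in range(n):
            if i == j:
                continue  # no self-loops
            s = ((1.0 - image_weight) * cosine(text_feats[i], text_feats[j])
                 + image_weight * cosine(image_feats[i], image_feats[j]))
            scored.append((s, j))
        # Highest similarity first; ties broken by smaller item index.
        scored.sort(key=lambda p: (-p[0], p[1]))
        graph[i] = [j for _, j in scored[:k]]
    return graph

# Toy features (made up): item 1 is textually close to item 0
# but visually close to item 2.
text = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
image = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.9]]

text_only = knn_item_graph(text, image, k=1, image_weight=0.0)
mixed = knn_item_graph(text, image, k=1, image_weight=0.5)
```

With these toy vectors, the text-only graph links item 1 to item 0, while the mixed graph links item 1 to item 2, because the image similarity between items 1 and 2 outweighs the text similarity between items 1 and 0.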
Cite
Text
Meng et al. "DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I12.33351
Markdown
[Meng et al. "DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/meng2025aaai-doge/) doi:10.1609/AAAI.V39I12.33351
BibTeX
@inproceedings{meng2025aaai-doge,
title = {{DOGE: LLMs-Enhanced Hyper-Knowledge Graph Recommender for Multimodal Recommendation}},
author = {Meng, Fanshen and Meng, Zhenhua and Jin, Ru and Lin, Rongheng and Wu, Budan},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {12399-12407},
doi = {10.1609/AAAI.V39I12.33351},
url = {https://mlanthology.org/aaai/2025/meng2025aaai-doge/}
}