Multimodal Knowledge Retrieval-Augmented Iterative Alignment for Satellite Commonsense Conversation

Abstract

Satellite technology significantly influences daily life through applications such as navigation and communication. Its development has generated a vast amount of multimodal satellite commonsense data, creating an urgent demand for conversation about satellite data. However, existing large language models hallucinate frequently and produce hard-to-understand answers on multimodal satellite data, because the data demands a high level of professional expertise and part of the information is opaque. To address these issues, we propose Sat-RIA, a multimodal satellite knowledge retrieval-augmented iterative alignment framework for satellite commonsense conversation. We first construct multi-view retrieval expert knowledge, which incorporates a satellite expert database, satellite rules, a satellite image database, and a satellite knowledge graph, to reduce hallucinations and enhance the interpretability of responses. We next design commonsense conversation instructions to make the answers more readable and understandable. Furthermore, a retrieval-augmented iterative alignment module improves response precision by aligning outputs with task-specific standards through multi-stage evaluations. Finally, we construct satellite multi-turn dialogue and visual question-answering datasets for a more comprehensive evaluation of satellite commonsense conversation. Experimental results demonstrate that Sat-RIA outperforms existing large language models and provides more comprehensible answers with fewer hallucinations.
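
The abstract describes the pipeline only at a high level, so the following is a minimal illustrative sketch of its general shape, not the authors' implementation: evidence is gathered from several knowledge views, then a draft answer is repeatedly checked by evaluators and regenerated until it passes. Every name here (`Evidence`, `multi_view_retrieve`, `iterative_align`, the toy retriever, generator, and check) is hypothetical, and the keyword matcher and canned generator stand in for real retrievers and a real language model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Evidence:
    source: str  # which knowledge view produced the passage (hypothetical labels)
    text: str


Retriever = Callable[[str], List[str]]
Generator = Callable[[str, List[Evidence], List[str]], str]
Evaluator = Callable[[str, List[Evidence]], Optional[str]]  # None means "passed"


def multi_view_retrieve(query: str, views: Dict[str, Retriever], k: int = 2) -> List[Evidence]:
    """Collect up to k passages from each knowledge view, tagged by source."""
    evidence: List[Evidence] = []
    for name, retrieve in views.items():
        for text in retrieve(query)[:k]:
            evidence.append(Evidence(name, text))
    return evidence


def iterative_align(query: str, evidence: List[Evidence], generate: Generator,
                    evaluators: List[Evaluator], max_rounds: int = 3) -> str:
    """Regenerate the answer until every evaluator passes or rounds run out."""
    answer = generate(query, evidence, [])
    for _ in range(max_rounds):
        feedback = []
        for check in evaluators:
            message = check(answer, evidence)
            if message is not None:
                feedback.append(message)
        if not feedback:
            break
        answer = generate(query, evidence, feedback)
    return answer


# --- Toy stand-ins (assumptions for illustration only) ---
def keyword_retriever(corpus: List[str]) -> Retriever:
    def retrieve(query: str) -> List[str]:
        terms = set(query.lower().split())
        return [doc for doc in corpus if terms & set(doc.lower().split())]
    return retrieve


def toy_generate(query: str, evidence: List[Evidence], feedback: List[str]) -> str:
    if feedback:  # after criticism, fall back to quoting the retrieved evidence
        return " ".join(e.text for e in evidence)
    return "I think it is very high."


def grounded(answer: str, evidence: List[Evidence]) -> Optional[str]:
    if any(e.text in answer for e in evidence):
        return None
    return "answer does not quote any retrieved evidence"


views = {
    "expert_db": keyword_retriever(["GEO satellites orbit at about 35786 km altitude"]),
    "rules": keyword_retriever(["orbit altitude must be reported in km"]),
}
query = "What altitude do GEO satellites orbit at"
evidence = multi_view_retrieve(query, views)
answer = iterative_align(query, evidence, toy_generate, [grounded])
```

In this toy run the first draft fails the grounding check, so the loop regenerates once with feedback and stops when the check passes. A real system would replace the single check with the multi-stage, task-specific evaluations the abstract mentions.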

Cite

Text

Li et al. "Multimodal Knowledge Retrieval-Augmented Iterative Alignment for Satellite Commonsense Conversation." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/908

Markdown

[Li et al. "Multimodal Knowledge Retrieval-Augmented Iterative Alignment for Satellite Commonsense Conversation." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/li2025ijcai-multimodal/) doi:10.24963/IJCAI.2025/908

BibTeX

@inproceedings{li2025ijcai-multimodal,
  title     = {{Multimodal Knowledge Retrieval-Augmented Iterative Alignment for Satellite Commonsense Conversation}},
  author    = {Li, Qian and Li, Xuchen and Chang, Zongyu and Zhang, Yuzheng and Ji, Cheng and Wang, Shangguang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {8168--8176},
  doi       = {10.24963/IJCAI.2025/908},
  url       = {https://mlanthology.org/ijcai/2025/li2025ijcai-multimodal/}
}