KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning
Abstract
Large Language Models (LLMs) have demonstrated impressive performance in natural language processing tasks by leveraging chain of thought (CoT) prompting, which enables step-by-step thinking. Extending LLMs with multimodal capabilities is a recent area of interest, but it incurs computational cost and requires substantial hardware resources. To address these challenges, we propose KAM-CoT, a framework that integrates CoT reasoning, Knowledge Graphs (KGs), and multiple modalities for a comprehensive understanding of multimodal tasks. KAM-CoT adopts a two-stage training process with KG grounding to generate effective rationales and answers. By incorporating external knowledge from KGs during reasoning, the model gains a deeper contextual understanding, reducing hallucinations and enhancing the quality of answers. This knowledge-augmented CoT reasoning empowers the model to handle questions requiring external context, providing more informed answers. Experimental findings show that KAM-CoT outperforms state-of-the-art methods. On the ScienceQA dataset, we achieve an average accuracy of 93.87%, surpassing GPT-3.5 (75.17%) by 18% and GPT-4 (83.99%) by 10%. Remarkably, KAM-CoT achieves these results with only 280M trainable parameters at a time, demonstrating its cost-efficiency and effectiveness.
Cite
Text

Mondal et al. "KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I17.29844

Markdown

[Mondal et al. "KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/mondal2024aaai-kam/) doi:10.1609/AAAI.V38I17.29844

BibTeX
@inproceedings{mondal2024aaai-kam,
title = {{KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning}},
author = {Mondal, Debjyoti and Modi, Suraj and Panda, Subhadarshi and Singh, Rituraj and Rao, Godawari Sudhakar},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
  pages = {18798--18806},
doi = {10.1609/AAAI.V38I17.29844},
url = {https://mlanthology.org/aaai/2024/mondal2024aaai-kam/}
}