MAFT: Multimodal Automated Fact-Checking via Textualization

Abstract

This paper proposes MAFT, a novel multimodal automated fact-checking system capable of handling content in any combination of text, images, videos, and audio. The core idea behind our system is the textualization of multimodal content using various machine learning techniques. MAFT then uses large language models (LLMs) to analyze this textualized content together with external information collected via web APIs. It generates interpretable fact-checking reports that include not only the verification results but also a detailed account of the verification process. With its adaptability and its ability to verify multimodal content automatically, MAFT contributes to the fight against the spread of multimodal misinformation.
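The flow described in the abstract (textualize each modality, gather external evidence, let an LLM judge, and record every step in a report) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Item`, `Report`, `TEXTUALIZERS`, `textualize`, and `fact_check` are hypothetical, the textualizers are placeholder stubs standing in for whatever machine learning models a real system would use, and the `retrieve` and `judge` callables stand in for the web API lookups and the LLM call, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Item:
    modality: str  # one of "text", "image", "video", "audio"
    payload: str   # raw text, or a path/identifier for a media file


@dataclass
class Report:
    verdict: str
    steps: List[str] = field(default_factory=list)  # trace of the verification process


def _stub(kind: str) -> Callable[[str], str]:
    # Placeholder for a real model (assumption: e.g. captioning or speech-to-text).
    return lambda payload: f"[{kind} description of {payload}]"


# Hypothetical registry mapping each modality to a function that returns text.
TEXTUALIZERS: Dict[str, Callable[[str], str]] = {
    "text": lambda payload: payload,  # text passes through unchanged
    "image": _stub("image"),
    "video": _stub("video"),
    "audio": _stub("audio"),
}


def textualize(items: List[Item]) -> List[str]:
    """Convert every item to text so later stages only handle strings."""
    return [TEXTUALIZERS[item.modality](item.payload) for item in items]


def fact_check(
    items: List[Item],
    retrieve: Callable[[List[str]], List[str]],    # stand-in for web API evidence lookup
    judge: Callable[[List[str], List[str]], str],  # stand-in for the LLM verdict
) -> Report:
    texts = textualize(items)
    steps = [f"textualized {i.modality}: {t}" for i, t in zip(items, texts)]
    evidence = retrieve(texts)
    steps += [f"evidence: {e}" for e in evidence]
    verdict = judge(texts, evidence)
    steps.append(f"verdict: {verdict}")
    return Report(verdict=verdict, steps=steps)
```

Passing `retrieve` and `judge` in as callables keeps the sketch runnable without any network access or model; in a real system they would wrap the external services, while the `steps` list plays the role of the interpretable report.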

Cite

Text

Kakizaki et al. "MAFT: Multimodal Automated Fact-Checking via Textualization." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I28.35354

Markdown

[Kakizaki et al. "MAFT: Multimodal Automated Fact-Checking via Textualization." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/kakizaki2025aaai-maft/) doi:10.1609/AAAI.V39I28.35354

BibTeX

@inproceedings{kakizaki2025aaai-maft,
  title     = {{MAFT: Multimodal Automated Fact-Checking via Textualization}},
  author    = {Kakizaki, Kazuya and Matsunaga, Yuto and Furukawa, Ryo},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {29646--29648},
  doi       = {10.1609/AAAI.V39I28.35354},
  url       = {https://mlanthology.org/aaai/2025/kakizaki2025aaai-maft/}
}