LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks

Nguyen, Truong Thanh Hung; Clement, Tobias; Nguyen, Phuc Truong Loc; Kemmerzell, Nils; Truong, Van Binh; Nguyen, Vo Thanh Khang; Abdelaal, Mohamed; Cao, Hung

doi:10.24963/ijcai.2024/1025

IJCAI 2024 pp. 8754-8758

Abstract

Internet of Things (IoT) sensors are ubiquitous technologies deployed across smart cities, industrial sites, and healthcare systems. They continuously generate time series data that enable advanced analytics and automation in industries. However, challenges such as the loss or ambiguity of sensor metadata, heterogeneity in data sources, varying sampling frequencies, inconsistent units of measurement, and irregular timestamps make raw IoT time series data difficult to interpret, undermining the effectiveness of smart systems. To address these challenges, we propose a novel deep learning model, DeepFeatIoT, which integrates learned local and global features with non-learned randomized convolutional kernel-based features and features from large language models (LLMs). This straightforward yet unique fusion of diverse learned and non-learned features significantly enhances IoT time series sensor data classification, even in scenarios with limited labeled data. Our model's effectiveness is demonstrated through its consistent and generalized performance across multiple real-world IoT sensor datasets from diverse critical application domains, outperforming state-of-the-art benchmark models. These results highlight DeepFeatIoT's potential to drive significant advancements in IoT analytics and support the development of next-generation smart systems.

Cite

Text

Nguyen et al. "LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/1025

Markdown

[Nguyen et al. "LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/nguyen2024ijcai-langxai/) doi:10.24963/ijcai.2024/1025

BibTeX

@inproceedings{nguyen2024ijcai-langxai,
  title     = {{LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks}},
  author    = {Nguyen, Truong Thanh Hung and Clement, Tobias and Nguyen, Phuc Truong Loc and Kemmerzell, Nils and Truong, Van Binh and Nguyen, Vo Thanh Khang and Abdelaal, Mohamed and Cao, Hung},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {8754--8758},
  doi       = {10.24963/ijcai.2024/1025},
  url       = {https://mlanthology.org/ijcai/2024/nguyen2024ijcai-langxai/}
}