Trusted Multi-View Learning with Label Noise
Abstract
Multimodal intent recognition (MIR) seeks to accurately interpret user intentions by integrating verbal and non-verbal information across video, audio, and text modalities. While existing approaches prioritize text analysis, they often overlook the rich semantic content embedded in non-verbal cues. This paper presents a novel Wavelet-Driven Multimodal Intent Recognition (WDMIR) framework that enhances intent understanding through frequency-domain analysis of non-verbal information. Specifically, we propose: (1) a wavelet-driven fusion module that performs synchronized decomposition and integration of video-audio features in the frequency domain, enabling fine-grained analysis of temporal dynamics; and (2) a cross-modal interaction mechanism that facilitates progressive feature enhancement from bimodal to trimodal integration, effectively bridging the semantic gap between verbal and non-verbal information. Extensive experiments on MIntRec demonstrate that our approach achieves state-of-the-art performance, surpassing previous methods by 1.13% in accuracy. Ablation studies further verify that the wavelet-driven fusion module significantly improves the extraction of semantic information from non-verbal sources, yielding a 0.41% increase in recognition accuracy when analyzing subtle emotional cues.
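The abstract does not specify how the wavelet-driven fusion module is built, so the following is only a minimal sketch of the general idea of decomposing two time-aligned feature sequences into frequency bands and recombining them. It assumes a single-level Haar wavelet, one feature channel per call, and a classic fusion rule (average the low-frequency band, keep the larger-magnitude high-frequency coefficient). The function names `haar_dwt`, `haar_idwt`, and `wavelet_fuse` are hypothetical and are not taken from the paper.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(signal):
    """Single-level Haar transform of an even-length sequence.

    Returns (low, high): the low-frequency (approximation) and
    high-frequency (detail) coefficients, each half the input length.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / SQRT2 for a, b in pairs]
    high = [(a - b) / SQRT2 for a, b in pairs]
    return low, high

def haar_idwt(low, high):
    """Inverse of haar_dwt: rebuild the sequence from both bands."""
    signal = []
    for l, h in zip(low, high):
        signal.append((l + h) / SQRT2)
        signal.append((l - h) / SQRT2)
    return signal

def wavelet_fuse(video, audio):
    """Fuse two time-aligned feature channels in the wavelet domain.

    Assumed rule (illustrative, not the paper's): average the
    low-frequency bands and keep, per position, the high-frequency
    coefficient with the larger magnitude.
    """
    if len(video) != len(audio):
        raise ValueError("sequences must have the same length")
    v_low, v_high = haar_dwt(video)
    a_low, a_high = haar_dwt(audio)
    low = [(v + a) / 2.0 for v, a in zip(v_low, a_low)]
    high = [v if abs(v) >= abs(a) else a for v, a in zip(v_high, a_high)]
    return haar_idwt(low, high)
```

To fuse multi-dimensional features, `wavelet_fuse` could be applied to each feature channel separately. A learned fusion module would replace the fixed averaging and selection rule with trainable weights.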
Cite
Text
Xu et al. "Trusted Multi-View Learning with Label Noise." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/582
Markdown
[Xu et al. "Trusted Multi-View Learning with Label Noise." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/xu2024ijcai-trusted/) doi:10.24963/ijcai.2024/582
BibTeX
@inproceedings{xu2024ijcai-trusted,
title = {{Trusted Multi-View Learning with Label Noise}},
author = {Xu, Cai and Zhang, Yilin and Guan, Ziyu and Zhao, Wei},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {5263-5271},
doi = {10.24963/ijcai.2024/582},
url = {https://mlanthology.org/ijcai/2024/xu2024ijcai-trusted/}
}