An Entanglement-Driven Fusion Neural Network for Video Sentiment Analysis
Abstract
Video data is inherently multimodal: an utterance can involve linguistic, visual and acoustic information. Therefore, a key challenge for video sentiment analysis is how to combine different modalities effectively for sentiment recognition. The latest neural network approaches achieve state-of-the-art performance, but they largely neglect how humans understand and reason about sentiment states. By contrast, recent quantum probabilistic neural models have achieved performance comparable to the state of the art, with better transparency and an increased level of interpretability. However, existing quantum-inspired models treat quantum states either as a classical mixture or as a separable tensor product across modalities, without inducing interactions that make the modalities correlated or non-separable (i.e., entangled). This means that current models have not fully exploited the expressive power of quantum probabilities. To fill this gap, we propose a transparent quantum probabilistic neural model. The model induces different modalities to interact in such a way that they may not be separable, encoding crossmodal information in the form of non-classical correlations. Comprehensive evaluation on two benchmark datasets for video sentiment analysis shows that the model achieves significant performance improvement. We also show that the degree of non-separability between modalities optimizes the post-hoc interpretability.
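The distinction between a separable tensor product and a non-separable (entangled) state can be illustrated with a small numerical sketch. This is not the model proposed in the paper; it is a generic textbook check, assuming two hypothetical two-dimensional modality states (named `text` and `audio` here purely for illustration) and using the entanglement entropy of a pure bipartite state as the measure of non-separability.

```python
import numpy as np

def entanglement_entropy(state, dim_a, dim_b):
    """Von Neumann entropy (in bits) of subsystem A for a pure bipartite state."""
    # Reshape the joint amplitude vector into a dim_a x dim_b matrix.
    psi = np.asarray(state, dtype=complex).reshape(dim_a, dim_b)
    psi = psi / np.linalg.norm(psi)
    # The singular values of this matrix are the Schmidt coefficients.
    schmidt = np.linalg.svd(psi, compute_uv=False)
    probs = schmidt ** 2
    probs = probs[probs > 1e-12]  # drop numerical zeros before taking logs
    return float(-(probs * np.log2(probs)).sum())

# Hypothetical unit vectors standing in for two modality states (illustrative only).
text = np.array([1.0, 0.0])
audio = np.array([1.0, 1.0]) / np.sqrt(2)

# Separable case: the joint state is a plain tensor (Kronecker) product.
separable = np.kron(text, audio)

# Non-separable case: a Bell-like state that cannot be written as a product.
entangled = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

separable_entropy = entanglement_entropy(separable, 2, 2)
entangled_entropy = entanglement_entropy(entangled, 2, 2)
print(f"separable: {separable_entropy:.3f} bits")
print(f"entangled: {entangled_entropy:.3f} bits")
```

A product state has a single non-zero Schmidt coefficient and therefore zero entropy, while the Bell-like state reaches the maximum of one bit for two-dimensional subsystems. How the paper itself quantifies non-separability is not stated in the abstract, so this measure is only an assumption for the sake of the example.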
Cite
Text
Gkoumas et al. "An Entanglement-Driven Fusion Neural Network for Video Sentiment Analysis." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/239
Markdown
[Gkoumas et al. "An Entanglement-Driven Fusion Neural Network for Video Sentiment Analysis." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/gkoumas2021ijcai-entanglement/) doi:10.24963/IJCAI.2021/239
BibTeX
@inproceedings{gkoumas2021ijcai-entanglement,
title = {{An Entanglement-Driven Fusion Neural Network for Video Sentiment Analysis}},
author = {Gkoumas, Dimitris and Li, Qiuchi and Yu, Yijun and Song, Dawei},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2021},
pages = {1736-1742},
doi = {10.24963/IJCAI.2021/239},
url = {https://mlanthology.org/ijcai/2021/gkoumas2021ijcai-entanglement/}
}