Emotion Analysis Using Audio/Video, EMG and EEG: A Dataset and Comparison Study
Abstract
This paper describes a study on automated emotion recognition using four different modalities: audio, video, electromyography (EMG), and electroencephalography (EEG). We collected a dataset with all four modalities as 12 human subjects expressed six different emotions or maintained a neutral expression. Three different aspects of emotion recognition were investigated: model selection, feature selection, and data selection. Both generative models (DBNs) and discriminative models (LSTMs) were applied to the four modalities, and from these analyses we conclude that LSTM is better for audio and video together with their corresponding sophisticated feature extractors (MFCC and CNN), whereas DBN is better for both EMG and EEG. By examining these signals at different stages (pre-speech, during-speech, and post-speech) of the current and following trials, we found that the most effective stages for emotion recognition from EEG occur after the emotion has been expressed, suggesting that the neural signals conveying an emotion are long-lasting.
Cite
Text
Abtahi et al. "Emotion Analysis Using Audio/Video, EMG and EEG: A Dataset and Comparison Study." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018. doi:10.1109/WACV.2018.00008
Markdown
[Abtahi et al. "Emotion Analysis Using Audio/Video, EMG and EEG: A Dataset and Comparison Study." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018.](https://mlanthology.org/wacv/2018/abtahi2018wacv-emotion/) doi:10.1109/WACV.2018.00008
BibTeX
@inproceedings{abtahi2018wacv-emotion,
title = {{Emotion Analysis Using Audio/Video, EMG and EEG: A Dataset and Comparison Study}},
author = {Abtahi, Farnaz and Ro, Tony and Li, Wei and Zhu, Zhigang},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2018},
  pages = {10--19},
doi = {10.1109/WACV.2018.00008},
url = {https://mlanthology.org/wacv/2018/abtahi2018wacv-emotion/}
}