Enhancing Emotion Recognition with Pre-Trained Masked Autoencoders and Sequential Learning
Abstract
Human emotion recognition plays a pivotal role in enabling natural interaction between humans and computers. This paper describes our approach to the Valence-Arousal (VA) Estimation Challenge, the Expression (Expr) Recognition Challenge, and the Action Unit (AU) Detection Challenge of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). We propose an approach aimed at improving continuous emotion recognition. We first pre-train a Masked Autoencoder (MAE) on facial datasets and then fine-tune it on the Aff-Wild2 dataset, which is annotated with expression (Expr) labels. The pre-trained model serves as a visual feature extractor, improving the robustness of the overall system. We further improve continuous emotion recognition by integrating Temporal Convolutional Network (TCN) modules and Transformer Encoder modules into our framework. Our model outperforms the baseline, placing 3rd in the Valence-Arousal Estimation Challenge and 2nd in both the Expression Recognition Challenge and the Action Unit Detection Challenge.
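The MAE pre-training stage mentioned in the abstract hides a random subset of image patches and trains the network to reconstruct them from the patches that remain visible. The following is a minimal sketch of that masking step only, not the authors' implementation. The function name `random_patch_mask`, the 75% mask ratio, and the 16x16 patch grid on a 224x224 input are illustrative assumptions taken from common MAE defaults; the abstract does not state the values used in this work.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) lists, MAE-style.

    mask_ratio is an assumed hyperparameter; 0.75 is a common MAE
    default, not a value reported in the abstract.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of patch positions
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])   # hidden from the encoder
    visible = sorted(indices[num_masked:])  # the only patches the encoder sees
    return visible, masked


# Assumed setup: a 224x224 face crop cut into 16x16 patches
# gives a 14 x 14 grid, i.e. 196 patches.
visible, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

In a full MAE, only the visible patches are passed through the encoder and a lightweight decoder reconstructs the masked ones. After pre-training, the decoder is typically discarded and the encoder is kept as the feature extractor, which matches the role the abstract describes for the pre-trained model.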
Cite
Text
Zhou et al. "Enhancing Emotion Recognition with Pre-Trained Masked Autoencoders and Sequential Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00469
Markdown
[Zhou et al. "Enhancing Emotion Recognition with Pre-Trained Masked Autoencoders and Sequential Learning." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/zhou2024cvprw-enhancing/) doi:10.1109/CVPRW63382.2024.00469
BibTeX
@inproceedings{zhou2024cvprw-enhancing,
title = {{Enhancing Emotion Recognition with Pre-Trained Masked Autoencoders and Sequential Learning}},
author = {Zhou, Weiwei and Lu, Jiada and Ling, Chenkun and Wang, Weifeng and Liu, Shaowei},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {4666-4672},
doi = {10.1109/CVPRW63382.2024.00469},
url = {https://mlanthology.org/cvprw/2024/zhou2024cvprw-enhancing/}
}