Valence and Arousal Estimation Based on Multimodal Temporal-Aware Features for Videos in the Wild

Abstract

This paper presents our submission to the Valence-Arousal Estimation Challenge of the 3rd Affective Behavior Analysis in-the-wild (ABAW) competition. Starting from multimodal feature representations that fuse visual and audio information, we use two types of temporal encoder, a transformer-based encoder and an LSTM-based encoder, to capture the temporal context in the video. On top of these temporal context-aware representations, fully-connected layers predict the valence and arousal values of each video frame. In addition, smoothing is applied to refine the initial predictions, and a model ensemble strategy combines the results from different model setups. Our system achieves Concordance Correlation Coefficients (CCC) of 0.606 for valence and 0.602 for arousal, for a mean CCC of 0.604, which ranks first in the challenge.
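For readers unfamiliar with the metric, the following is a minimal sketch of how a Concordance Correlation Coefficient can be computed from per-frame predictions and labels. It uses the standard CCC definition; the function names `ccc` and `mean_ccc` are illustrative and not taken from the paper, and averaging the valence and arousal scores into a mean CCC is assumed here as the usual challenge convention.

```python
import numpy as np


def ccc(pred, gold):
    """Concordance Correlation Coefficient between two 1-D sequences.

    Standard definition: 2*cov / (var_pred + var_gold + (mean_pred - mean_gold)^2),
    using population (biased) variance and covariance.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mean_p, mean_g = pred.mean(), gold.mean()
    var_p, var_g = pred.var(), gold.var()
    cov = ((pred - mean_p) * (gold - mean_g)).mean()
    return 2.0 * cov / (var_p + var_g + (mean_p - mean_g) ** 2)


def mean_ccc(ccc_valence, ccc_arousal):
    """Average of the two scores (assumed convention, not stated in the abstract)."""
    return 0.5 * (ccc_valence + ccc_arousal)


if __name__ == "__main__":
    # Toy per-frame labels and predictions in [-1, 1]; values are made up.
    gold = np.array([-0.5, 0.0, 0.5, 0.25])
    pred = np.array([-0.4, 0.1, 0.4, 0.20])
    print("CCC:", ccc(pred, gold))
```

Unlike Pearson correlation, CCC also penalizes differences in mean and scale between predictions and labels, so a prediction that is perfectly correlated but shifted scores below 1. Calling `mean_ccc(0.606, 0.602)` averages the two reported per-dimension scores.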

Cite

Text

Meng et al. "Valence and Arousal Estimation Based on Multimodal Temporal-Aware Features for Videos in the Wild." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022. doi:10.1109/CVPRW56347.2022.00261

Markdown

[Meng et al. "Valence and Arousal Estimation Based on Multimodal Temporal-Aware Features for Videos in the Wild." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022.](https://mlanthology.org/cvprw/2022/meng2022cvprw-valence/) doi:10.1109/CVPRW56347.2022.00261

BibTeX

@inproceedings{meng2022cvprw-valence,
  title     = {{Valence and Arousal Estimation Based on Multimodal Temporal-Aware Features for Videos in the Wild}},
  author    = {Meng, Liyu and Liu, Yuchen and Liu, Xiaolong and Huang, Zhaopei and Jiang, Wenqiang and Zhang, Tenggan and Liu, Chuanhe and Jin, Qin},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2022},
  pages     = {2344-2351},
  doi       = {10.1109/CVPRW56347.2022.00261},
  url       = {https://mlanthology.org/cvprw/2022/meng2022cvprw-valence/}
}