Learning Spatio-Temporal Features with Partial Expression Sequences for On-the-Fly Prediction

Abstract

Spatio-temporal feature encoding is essential for encoding facial expression dynamics in video sequences. At test time, most spatio-temporal encoding methods assume that a temporally segmented sequence is fed to a learned model, which can require the prediction to wait until the full sequence is available to an auxiliary task that performs the temporal segmentation. This delays the prediction of the expression. In an interactive setting, such as affective interactive agents, such a delay cannot be tolerated. Therefore, training a model that can accurately predict the facial expression "on-the-fly" (as frames are fed to the system) is essential. In this paper, we propose a new spatio-temporal feature learning method that allows prediction with partial sequences, so that the prediction can be performed on-the-fly. The proposed method utilizes an estimated expression intensity to generate dense labels, which are used to regulate the prediction model training with a novel objective function. As a result, the learned spatio-temporal features can robustly predict the expression from partial (incomplete) expression sequences, on-the-fly. Experimental results showed that the proposed method achieved higher recognition rates than state-of-the-art methods on both evaluation datasets. More importantly, the results verified that the proposed method improved prediction when given partial expression sequences as input.

Cite

Text

Baddar and Ro. "Learning Spatio-Temporal Features with Partial Expression Sequences for On-the-Fly Prediction." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.12332

Markdown

[Baddar and Ro. "Learning Spatio-Temporal Features with Partial Expression Sequences for On-the-Fly Prediction." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/baddar2018aaai-learning/) doi:10.1609/AAAI.V32I1.12332

BibTeX

@inproceedings{baddar2018aaai-learning,
  title     = {{Learning Spatio-Temporal Features with Partial Expression Sequences for On-the-Fly Prediction}},
  author    = {Baddar, Wissam J. and Ro, Yong Man},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {6666-6673},
  doi       = {10.1609/AAAI.V32I1.12332},
  url       = {https://mlanthology.org/aaai/2018/baddar2018aaai-learning/}
}