Disentangling Factors of Variation for Facial Expression Recognition

Abstract

We propose a semi-supervised approach to emotion recognition in 2D face images that uses recent ideas in deep learning for handling the factors of variation present in data. An emotion classification algorithm should be robust to both (1) the variations in the pose of the face that remain after centering and alignment, and (2) the identity or morphology of the face. To achieve this invariance, we propose to learn a hierarchy of features in which we gradually filter out the factors of variation arising from both (1) and (2). We address (1) by using a multi-scale contractive convolutional network (CCNET) to obtain invariance to translations of the facial traits in the image. Using the feature representation produced by the CCNET, we train a Contractive Discriminative Analysis (CDA) feature extractor, a novel variant of the Contractive Auto-Encoder (CAE), designed to learn a representation that separates the emotion-related factors from the others (which mostly capture the subject identity and what is left of pose after the CCNET). This system beats the state of the art on a recently proposed dataset for facial expression recognition, the Toronto Face Database, moving the state-of-the-art accuracy from 82.4% to 85.0%, while the CCNET and CDA improve the accuracy of a standard CAE by 8%.
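The abstract does not spell out the CCNET or CDA objectives, so the following is only a minimal sketch of the contraction penalty used by a standard Contractive Auto-Encoder, the model the CDA is described as a variant of. It assumes a single sigmoid encoder layer h = sigmoid(Wx + b); the function names `sigmoid` and `contractive_penalty` and all array shapes are illustrative choices, not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def contractive_penalty(x, W, b):
    """Squared Frobenius norm of the encoder Jacobian dh/dx.

    Assumes a sigmoid encoder h = sigmoid(W @ x + b). Its Jacobian is
    diag(h * (1 - h)) @ W, so the squared norm factorizes into a sum
    over hidden units and never needs the full Jacobian matrix.
    """
    h = sigmoid(W @ x + b)
    return float(np.sum((h * (1.0 - h)) ** 2 * np.sum(W ** 2, axis=1)))


# Hypothetical sizes: 5 input features, 3 hidden units.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
x = rng.normal(size=5)

# Cross-check against the explicitly built Jacobian.
h = sigmoid(W @ x + b)
jacobian = (h * (1.0 - h))[:, None] * W
print(contractive_penalty(x, W, b), float(np.sum(jacobian ** 2)))
```

In a CAE this penalty is added, with a weighting coefficient, to the reconstruction loss, which encourages the learned features to change little under small perturbations of the input. How the CDA modifies this to separate emotion-related factors from the others is not stated in the abstract and is not modeled here.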

Cite

Text

Rifai et al. "Disentangling Factors of Variation for Facial Expression Recognition." European Conference on Computer Vision, 2012. doi:10.1007/978-3-642-33783-3_58

Markdown

[Rifai et al. "Disentangling Factors of Variation for Facial Expression Recognition." European Conference on Computer Vision, 2012.](https://mlanthology.org/eccv/2012/rifai2012eccv-disentangling/) doi:10.1007/978-3-642-33783-3_58

BibTeX

@inproceedings{rifai2012eccv-disentangling,
  title     = {{Disentangling Factors of Variation for Facial Expression Recognition}},
  author    = {Rifai, Salah and Bengio, Yoshua and Courville, Aaron C. and Vincent, Pascal and Mirza, Mehdi},
  booktitle = {European Conference on Computer Vision},
  year      = {2012},
  pages     = {808--822},
  doi       = {10.1007/978-3-642-33783-3_58},
  url       = {https://mlanthology.org/eccv/2012/rifai2012eccv-disentangling/}
}