Low Quality Video Face Recognition: Multi-Mode Aggregation Recurrent Network (MARN)

Abstract

Face recognition performance deteriorates when face images are of very low quality. For low-quality video sequences, however, more discriminative features can be obtained by aggregating the information in video frames. We propose a Multi-mode Aggregation Recurrent Network (MARN) for real-world low-quality video face recognition. Unlike existing recurrent networks (RNNs), MARN is robust against overfitting since it learns to aggregate pre-trained embeddings. Compared with quality-aware aggregation methods, MARN utilizes the video context and learns multiple attention vectors adaptively. Empirical results on three video face recognition datasets, IJB-S, YTF, and PaSC, show that MARN significantly boosts performance on the low-quality video dataset while achieving comparable results on the high-quality video datasets.
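
To make the idea of aggregating pre-trained frame embeddings with attention more concrete, the following is a minimal, generic sketch of per-dimension attention pooling over a video's frame embeddings. It is not the authors' implementation: the abstract does not specify how MARN computes its attention, so the function names (`softmax`, `aggregate`, `l2_normalize`) and the assumption that per-frame, per-dimension scores are supplied as plain inputs are purely illustrative. In a MARN-style model those scores would presumably come from a recurrent network that reads the video context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(embeddings, scores):
    """Pool T frame embeddings (each of length D) into one D-dim vector.

    `scores` is a T x D list of unnormalized attention scores (hypothetical
    input; assumed here, not taken from the paper). For each dimension, the
    scores are normalized across frames and used as pooling weights.
    """
    dims = len(embeddings[0])
    pooled = []
    for d in range(dims):
        weights = softmax([frame_scores[d] for frame_scores in scores])
        pooled.append(sum(w * emb[d] for w, emb in zip(weights, embeddings)))
    return pooled


def l2_normalize(vector):
    """Scale a vector to unit length; return it unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)
```

With equal scores for every frame, `aggregate` reduces to plain average pooling; raising the score of one frame in one dimension shifts only that dimension toward that frame, which is the sense in which a vector of attention weights differs from a single per-frame quality score.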

Cite

Text

Gong et al. "Low Quality Video Face Recognition: Multi-Mode Aggregation Recurrent Network (MARN)." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00132

Markdown

[Gong et al. "Low Quality Video Face Recognition: Multi-Mode Aggregation Recurrent Network (MARN)." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/gong2019iccvw-low/) doi:10.1109/ICCVW.2019.00132

BibTeX

@inproceedings{gong2019iccvw-low,
  title     = {{Low Quality Video Face Recognition: Multi-Mode Aggregation Recurrent Network (MARN)}},
  author    = {Gong, Sixue and Shi, Yichun and Jain, Anil K.},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {1027-1035},
  doi       = {10.1109/ICCVW.2019.00132},
  url       = {https://mlanthology.org/iccvw/2019/gong2019iccvw-low/}
}