Spatio-Temporal Embedding for Statistical Face Recognition from Video
Abstract
This paper addresses the problem of how to learn an appropriate feature representation from video to benefit video-based face recognition. By simultaneously exploiting spatial and temporal information, the problem is posed as learning a Spatio-Temporal Embedding (STE) from raw video. The STE of a video sequence is defined as a condensed version of the sequence that captures the essence of its space-time characteristics. Relying on the co-occurrence statistics and supervised signatures provided by training videos, STE preserves the intrinsic temporal structures hidden in the video volume while encoding discriminative cues into the spatial domain. To perform STE, we propose two novel techniques, Bayesian keyframe learning and nonparametric discriminant embedding (NDE), for temporal and spatial learning, respectively. Based on the learned STEs, we derive a statistical formulation of the recognition problem with a probabilistic fusion model. On a large face video database containing more than 200 training and testing sequences, our approach consistently outperforms state-of-the-art methods, achieving perfect recognition accuracy.
Cite
Text
Liu et al. "Spatio-Temporal Embedding for Statistical Face Recognition from Video." European Conference on Computer Vision, 2006. doi:10.1007/11744047_29
Markdown
[Liu et al. "Spatio-Temporal Embedding for Statistical Face Recognition from Video." European Conference on Computer Vision, 2006.](https://mlanthology.org/eccv/2006/liu2006eccv-spatio/) doi:10.1007/11744047_29
BibTeX
@inproceedings{liu2006eccv-spatio,
title = {{Spatio-Temporal Embedding for Statistical Face Recognition from Video}},
author = {Liu, Wei and Li, Zhifeng and Tang, Xiaoou},
booktitle = {European Conference on Computer Vision},
year = {2006},
pages = {374--388},
doi = {10.1007/11744047_29},
url = {https://mlanthology.org/eccv/2006/liu2006eccv-spatio/}
}