Bi-Level Multi-Column Convolutional Neural Networks for Facial Landmark Point Detection
Abstract
We propose a bi-level Multi-column Convolutional Neural Networks (MCNNs) framework for face alignment. Global CNNs roughly estimate the coordinates of all landmark points, and Local CNNs take patches sampled around the landmarks predicted by the Global CNNs as input and predict the displacement between the ground truth and those predicted landmarks. The multi-column architecture leverages the finding that the optimal resolution differs from point to point. Further, the coordinates of all landmarks and their displacements are estimated simultaneously in the Global and Local CNNs, so global shape constraints are imposed naturally and implicitly, which makes the method robust to significant variations in pose, expression, occlusion, and illumination. Extensive experiments demonstrate that our method achieves state-of-the-art performance for both image-based and video-based face alignment on many publicly available datasets.
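The coarse-to-fine flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `crop_patch` and `bilevel_predict`, the patch size, and the callables `global_model` and `local_model` are hypothetical stand-ins for trained networks, and the multi-column (multi-resolution) inputs are omitted for brevity.

```python
import numpy as np


def crop_patch(image, center, size):
    """Crop a size x size patch centred on (x, y), zero-padding at the borders.

    `size` is assumed to be odd so the patch has a well-defined centre pixel.
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    # Clamp the centre to the image so the slice always stays in bounds.
    x = min(max(int(round(center[0])), 0), image.shape[1] - 1)
    y = min(max(int(round(center[1])), 0), image.shape[0] - 1)
    # After padding, original pixel (y, x) sits at (y + half, x + half),
    # so the patch centred there starts at (y, x) in padded coordinates.
    return padded[y:y + size, x:x + size]


def bilevel_predict(image, global_model, local_model, patch_size=15):
    """Two-level landmark estimate: coarse coordinates plus a local correction.

    global_model(image)   -> (K, 2) array of rough (x, y) landmark coordinates
    local_model(patches)  -> (K, 2) array of displacements, one per landmark
    Both callables are hypothetical placeholders for trained CNNs.
    """
    coarse = np.asarray(global_model(image), dtype=float)
    # One patch per landmark, sampled at the globally predicted position.
    patches = np.stack([crop_patch(image, p, patch_size) for p in coarse])
    # All displacements are predicted jointly from the full stack of patches.
    displacement = np.asarray(local_model(patches), dtype=float)
    return coarse + displacement
```

Passing all patches to `local_model` in a single call mirrors the abstract's point that the displacements of all landmarks are estimated simultaneously rather than one point at a time.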
Cite
Text
Xu and Gao. "Bi-Level Multi-Column Convolutional Neural Networks for Facial Landmark Point Detection." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-48881-3_37
Markdown
[Xu and Gao. "Bi-Level Multi-Column Convolutional Neural Networks for Facial Landmark Point Detection." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/xu2016eccv-bi/) doi:10.1007/978-3-319-48881-3_37
BibTeX
@inproceedings{xu2016eccv-bi,
title = {{Bi-Level Multi-Column Convolutional Neural Networks for Facial Landmark Point Detection}},
author = {Xu, Yanyu and Gao, Shenghua},
booktitle = {European Conference on Computer Vision},
year = {2016},
pages = {536-551},
doi = {10.1007/978-3-319-48881-3_37},
url = {https://mlanthology.org/eccv/2016/xu2016eccv-bi/}
}