KPNet: Towards Minimal Face Detector

Abstract

The small receptive field and limited capacity of minimal neural networks restrict their performance when they are used as detector backbones. In this work, we find that the appearance features of a generic face are discriminative enough for a tiny, shallow neural network to distinguish it from the background. The essential barriers are 1) the vague definition of the face bounding box and 2) the tricky design of anchor boxes or receptive fields. Unlike most top-down methods for joint face detection and alignment, the proposed KPNet detects small facial keypoints instead of the whole face, in a bottom-up manner. It first predicts the facial landmarks from a low-resolution image via a well-designed fine-grained scale approximation and a scale-adaptive soft-argmax operator. Finally, precise face bounding boxes, however they are defined, can be inferred from the keypoints. Without any complex head architecture or meticulous network design, KPNet achieves state-of-the-art accuracy on generic face detection and alignment benchmarks with only $\sim1M$ parameters, runs at 1000 fps on GPU, and can easily run in real time on most modern front-end chips.
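To make the keypoint-to-box idea concrete, the following is a minimal sketch rather than the authors' implementation. It shows a plain soft-argmax over a 2D heatmap, followed by a box derived from a set of keypoints. The function names, the temperature `beta`, and the relative `margin` are assumptions introduced for illustration; the abstract does not specify how the scale-adaptive variant or the box rule is defined.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) position under softmax(beta * heatmap).

    `heatmap` is a list of rows; `beta` is a hypothetical temperature
    (larger values approach a hard argmax).
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Weighted average of column indices (x) and row indices (y).
    x = sum(w * c for row in weights for c, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


def box_from_keypoints(points, margin=0.0):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the keypoints.

    `margin` is an assumed relative padding; any box definition could be
    substituted here, since the box is derived from the keypoints.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    pad_x = margin * (max(xs) - min(xs))
    pad_y = margin * (max(ys) - min(ys))
    return (min(xs) - pad_x, min(ys) - pad_y, max(xs) + pad_x, max(ys) + pad_y)


if __name__ == "__main__":
    heatmap = [[0, 0, 0, 0], [0, 5, 5, 0], [0, 0, 0, 0]]
    print(soft_argmax_2d(heatmap))
    print(box_from_keypoints([(1, 2), (3, 6)], margin=0.5))
```

Because the soft-argmax returns a weighted average rather than an integer index, it can localize a keypoint at sub-pixel precision even on a low-resolution heatmap, which is one plausible reason such an operator suits a small network.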

Cite

Text

Song et al. "KPNet: Towards Minimal Face Detector." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I07.6878

Markdown

[Song et al. "KPNet: Towards Minimal Face Detector." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/song2020aaai-kpnet/) doi:10.1609/AAAI.V34I07.6878

BibTeX

@inproceedings{song2020aaai-kpnet,
  title     = {{KPNet: Towards Minimal Face Detector}},
  author    = {Song, Guanglu and Liu, Yu and Zang, Yuhang and Wang, Xiaogang and Leng, Biao and Yuan, Qingsheng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {12015-12022},
  doi       = {10.1609/AAAI.V34I07.6878},
  url       = {https://mlanthology.org/aaai/2020/song2020aaai-kpnet/}
}