SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition

Zhang, Han; Xu, Tao; Elhoseiny, Mohamed; Huang, Xiaolei; Zhang, Shaoting; Elgammal, Ahmed; Metaxas, Dimitris

doi:10.1109/CVPR.2016.129

CVPR 2016

Abstract

Most convolutional neural networks (CNNs) lack mid-level layers that model semantic parts of objects. This limits CNN-based methods from reaching their full potential in detecting and utilizing small semantic parts in recognition. Introducing such mid-level layers can facilitate the extraction of part-specific features, which can be utilized for better recognition performance. This is particularly important in the domain of fine-grained recognition. In this paper, we propose a new CNN architecture that integrates semantic part detection and abstraction (SPDA-CNN) for fine-grained classification. The proposed network has two sub-networks: one for detection and one for recognition. The detection sub-network has a novel top-down proposal method to generate small semantic part candidates for detection. The classification sub-network introduces novel part layers that extract features from parts detected by the detection sub-network and combine them for recognition. As a result, the proposed architecture provides an end-to-end network that performs detection, localization of multiple semantic parts, and whole-object recognition within one framework that shares the computation of convolutional filters. Our method outperforms state-of-the-art methods by a large margin for small-part detection (e.g., our precision of 93.40% vs. the best previous precision of 74.00% for detecting the head on CUB-2011). It also compares favorably to the existing state of the art on fine-grained classification, e.g., it achieves 85.14% accuracy on CUB-2011.
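
The idea of part layers that read features from detected part regions of a shared convolutional feature map and combine them can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `roi_max_pool` and `part_features`, the max-pooling of each part box to a fixed grid, the zero-filling of undetected parts, and the plain concatenation are all assumptions chosen for illustration.

```python
import numpy as np

def roi_max_pool(feature_map, box, out_size=2):
    """Max-pool one box of a (C, H, W) feature map to (C, out_size, out_size).

    `box` is (x0, y0, x1, y1) in feature-map cells, end-exclusive,
    and must cover at least one cell.
    """
    x0, y0, x1, y1 = box
    region = feature_map[:, y0:y1, x0:x1]
    channels, rh, rw = region.shape
    # Integer bin edges that split the region into an out_size x out_size grid.
    ys = np.linspace(0, rh, out_size + 1).astype(int)
    xs = np.linspace(0, rw, out_size + 1).astype(int)
    out = np.empty((channels, out_size, out_size), dtype=feature_map.dtype)
    for i in range(out_size):
        for j in range(out_size):
            # Force every bin to span at least one cell for tiny regions.
            ya, yb = ys[i], max(ys[i + 1], ys[i] + 1)
            xa, xb = xs[j], max(xs[j + 1], xs[j] + 1)
            out[:, i, j] = region[:, ya:yb, xa:xb].max(axis=(1, 2))
    return out

def part_features(feature_map, part_boxes, out_size=2):
    """Concatenate pooled features of all parts, in a fixed part order.

    A part given as None (assumed to mean "not detected") contributes zeros,
    so the output length does not depend on which parts were found.
    """
    channels = feature_map.shape[0]
    pooled = []
    for box in part_boxes:
        if box is None:
            pooled.append(np.zeros(channels * out_size * out_size,
                                   dtype=feature_map.dtype))
        else:
            pooled.append(roi_max_pool(feature_map, box, out_size).ravel())
    return np.concatenate(pooled)

# Toy shared feature map with 2 channels on a 4x4 grid, and two part boxes.
fm = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
feats = part_features(fm, [(0, 0, 2, 2), (1, 1, 4, 4)])
print(feats.shape)
```

Because every part is pooled from the same feature map, the convolutional computation is done once per image, and the concatenated vector could then be passed to a classifier.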

Cite

Text

Zhang et al. "SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.129

Markdown

[Zhang et al. "SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/zhang2016cvpr-spdacnn/) doi:10.1109/CVPR.2016.129

BibTeX

@inproceedings{zhang2016cvpr-spdacnn,
  title     = {{SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition}},
  author    = {Zhang, Han and Xu, Tao and Elhoseiny, Mohamed and Huang, Xiaolei and Zhang, Shaoting and Elgammal, Ahmed and Metaxas, Dimitris},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.129},
  url       = {https://mlanthology.org/cvpr/2016/zhang2016cvpr-spdacnn/}
}