Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition

Zheng, Heliang; Fu, Jianlong; Mei, Tao; Luo, Jiebo

doi:10.1109/ICCV.2017.557

ICCV 2017

Abstract

Recognizing fine-grained categories (e.g., bird species) relies heavily on discriminative part localization and part-based fine-grained feature learning. Existing approaches predominantly address these two challenges independently, neglecting the fact that part localization (e.g., the head of a bird) and fine-grained feature learning (e.g., head shape) are mutually correlated. In this paper, we propose a novel part learning approach based on a multi-attention convolutional neural network (MA-CNN), in which part generation and feature learning reinforce each other. MA-CNN consists of convolution, channel grouping and part classification sub-networks. The channel grouping network takes feature channels from the convolutional layers as input and generates multiple parts by clustering, weighting and pooling spatially-correlated channels. The part classification network then classifies an image by each individual part, through which more discriminative fine-grained features can be learned. Two losses are proposed to guide the multi-task learning of channel grouping and part classification; they encourage MA-CNN to generate more discriminative parts from feature channels and to learn better fine-grained features from parts in a mutually reinforcing way. MA-CNN does not need bounding box or part annotations and can be trained end-to-end. We incorporate the learned parts from MA-CNN with part-CNN for recognition, and show the best performance on three challenging published fine-grained datasets, namely CUB-Birds, FGVC-Aircraft and Stanford-Cars.
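
The channel grouping step described in the abstract (clustering, weighting and pooling spatially-correlated channels) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the details, so grouping channels by the location of their peak response, squashing the summed group response with a sigmoid, and all function names (`peak_location`, `group_channels`, `part_attention`, `pool_part`) are assumptions made for illustration only. The trainable weighting and the two losses are omitted.

```python
import math

def peak_location(channel):
    """Return (row, col) of the largest response in a 2-D channel."""
    coords = ((r, c) for r in range(len(channel)) for c in range(len(channel[0])))
    return max(coords, key=lambda rc: channel[rc[0]][rc[1]])

def group_channels(features, seeds):
    """Assign each channel to the seed nearest its peak (assumed clustering rule)."""
    groups = [[] for _ in seeds]
    for idx, channel in enumerate(features):
        pr, pc = peak_location(channel)
        nearest = min(
            range(len(seeds)),
            key=lambda k: (pr - seeds[k][0]) ** 2 + (pc - seeds[k][1]) ** 2,
        )
        groups[nearest].append(idx)
    return groups

def part_attention(features, group):
    """Sum the channels of one group and apply a sigmoid (assumed weighting rule)."""
    h, w = len(features[0]), len(features[0][0])
    return [
        [1.0 / (1.0 + math.exp(-sum(features[i][r][c] for i in group))) for c in range(w)]
        for r in range(h)
    ]

def pool_part(features, attention):
    """Attention-weighted sum over space, giving one value per channel."""
    return [
        sum(ch[r][c] * attention[r][c] for r in range(len(ch)) for c in range(len(ch[0])))
        for ch in features
    ]

# Toy input: three 2x2 channels; two peak at the top-left, one at the bottom-right.
features = [
    [[5.0, 0.0], [0.0, 0.0]],
    [[4.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 6.0]],
]
seeds = [(0, 0), (1, 1)]  # hypothetical cluster centres

groups = group_channels(features, seeds)
attention = part_attention(features, groups[0])
pooled = pool_part(features, attention)
```

Here the first two channels fall into one group and the third into another, so each group yields its own attention map and, after pooling, its own part feature for a per-part classifier.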

Cite

Text

Zheng et al. "Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.557

Markdown

[Zheng et al. "Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/zheng2017iccv-learning-a/) doi:10.1109/ICCV.2017.557

BibTeX

@inproceedings{zheng2017iccv-learning-a,
  title     = {{Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition}},
  author    = {Zheng, Heliang and Fu, Jianlong and Mei, Tao and Luo, Jiebo},
  booktitle = {International Conference on Computer Vision},
  year      = {2017},
  doi       = {10.1109/ICCV.2017.557},
  url       = {https://mlanthology.org/iccv/2017/zheng2017iccv-learning-a/}
}