Learning Deep Features for Discriminative Localization

Abstract

In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability despite being trained on image-level labels. While this technique was previously proposed as a means of regularizing training, we find that it actually builds a generic localizable deep representation that exposes the implicit attention of CNNs on an image. Despite the apparent simplicity of global average pooling, we are able to achieve 37.1% top-5 error for object localization on ILSVRC 2014 without training on any bounding box annotation. We demonstrate that our network is able to localize the discriminative image regions on a variety of tasks despite not being trained for them.
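The localization the abstract describes comes from the class activation mapping the paper introduces: for a class c, each convolutional feature map f_k(x, y) is weighted by the classifier weight w_k^c that follows the global average pooling layer, and the weighted maps are summed into a heatmap M_c(x, y) = Σ_k w_k^c f_k(x, y). A minimal NumPy sketch, with illustrative shapes and random placeholder tensors standing in for a trained network's features and weights:

```python
import numpy as np

# Hypothetical dimensions for illustration: 512 feature maps of
# spatial size 14x14, and a 1000-way linear classifier applied
# after global average pooling.
num_maps, h, w, num_classes = 512, 14, 14, 1000
rng = np.random.default_rng(0)
feature_maps = rng.random((num_maps, h, w))       # f_k(x, y)
fc_weights = rng.random((num_classes, num_maps))  # w_k^c

def class_activation_map(features, weights, class_idx):
    """M_c(x, y) = sum_k w_k^c * f_k(x, y): weight each feature map
    by the class's classifier weight and sum over the map index k."""
    return np.tensordot(weights[class_idx], features, axes=1)

cam = class_activation_map(feature_maps, fc_weights, class_idx=0)
# cam has shape (14, 14); upsampling it to the input resolution
# highlights the image regions most discriminative for the class.
```

Because global average pooling and the linear classifier commute, the class score is exactly the spatial average of this heatmap, which is why the map localizes the evidence behind the prediction.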

Cite

Text

Zhou et al. "Learning Deep Features for Discriminative Localization." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.319

Markdown

[Zhou et al. "Learning Deep Features for Discriminative Localization." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/zhou2016cvpr-learning-a/) doi:10.1109/CVPR.2016.319

BibTeX

@inproceedings{zhou2016cvpr-learning-a,
  title     = {{Learning Deep Features for Discriminative Localization}},
  author    = {Zhou, Bolei and Khosla, Aditya and Lapedriza, Agata and Oliva, Aude and Torralba, Antonio},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.319},
  url       = {https://mlanthology.org/cvpr/2016/zhou2016cvpr-learning-a/}
}