Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features

Abstract

In this paper we examine the effect of receptive field designs on classification accuracy in the commonly adopted pipeline of image classification. While existing algorithms usually use manually defined spatial regions for pooling, we show that learning more adaptive receptive fields increases performance even with a significantly smaller codebook size at the coding layer. To learn the optimal pooling parameters, we adopt the idea of over-completeness by starting with a large number of receptive field candidates, and train a classifier with structured sparsity to only use a sparse subset of all the features. An efficient algorithm based on incremental feature selection and retraining is proposed for fast learning. With this method, we achieve the best published performance on the CIFAR-10 dataset, using a much lower dimensional feature space than previous methods.
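
The idea in the abstract (build an over-complete set of receptive field candidates, then keep only a sparse subset) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the candidates are taken to be all axis-aligned rectangles on a small grid, pooling is max pooling, and the selection step is a plain orthogonal-matching-pursuit-style greedy loop with a least-squares refit, standing in for the paper's structured-sparsity classifier and its incremental selection procedure. The function names `rectangular_fields`, `pool_candidates`, and `greedy_select` are hypothetical.

```python
import numpy as np


def rectangular_fields(grid):
    """Enumerate every axis-aligned rectangle (r0, r1, c0, c1) on a grid x grid layout.

    Assumption: rectangles are used as the candidate receptive fields.
    Row and column ranges are half-open, i.e. rows r0..r1-1 and cols c0..c1-1.
    """
    return [
        (r0, r1, c0, c1)
        for r0 in range(grid)
        for r1 in range(r0 + 1, grid + 1)
        for c0 in range(grid)
        for c1 in range(c0 + 1, grid + 1)
    ]


def pool_candidates(codes, fields):
    """Max-pool a (grid, grid, K) map of coded activations over each candidate field.

    Returns an array of shape (len(fields), K): one pooled K-vector per field.
    """
    return np.stack(
        [codes[r0:r1, c0:c1].max(axis=(0, 1)) for r0, r1, c0, c1 in fields]
    )


def greedy_select(X, y, n_select):
    """Greedy forward selection with refitting (OMP-style stand-in, not the paper's method).

    X: (n_samples, n_features) pooled features, y: (n_samples,) targets.
    Each round adds the unused feature most correlated with the current
    residual, then refits least-squares weights on all selected features.
    """
    selected = []
    weights = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(n_select):
        scores = np.abs(X.T @ residual)
        if selected:
            scores[selected] = -np.inf  # never pick the same feature twice
        selected.append(int(np.argmax(scores)))
        weights, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ weights
    return selected, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fields = rectangular_fields(4)            # all rectangles on a 4 x 4 grid
    codes = rng.random((4, 4, 8))             # toy coded activations, K = 8
    pooled = pool_candidates(codes, fields)   # one row per candidate field
    print(len(fields), pooled.shape)
```

In a full pipeline, each (field, code) pair would contribute one column of `X`, so the selected columns identify which receptive fields the classifier actually uses; the final feature dimension is then the number of selected columns rather than the full candidate count.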

Cite

Text

Jia et al. "Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012. doi:10.1109/CVPR.2012.6248076

Markdown

[Jia et al. "Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2012.](https://mlanthology.org/cvpr/2012/jia2012cvpr-beyond/) doi:10.1109/CVPR.2012.6248076

BibTeX

@inproceedings{jia2012cvpr-beyond,
  title     = {{Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features}},
  author    = {Jia, Yangqing and Huang, Chang and Darrell, Trevor},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2012},
  pages     = {3370--3377},
  doi       = {10.1109/CVPR.2012.6248076},
  url       = {https://mlanthology.org/cvpr/2012/jia2012cvpr-beyond/}
}