Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection

Abstract

Deep Convolutional Neural Networks (DCNNs) achieve invariance to domain transformations (deformations) by using multiple 'max-pooling' (MP) layers. In this work we show that alternative methods of modeling deformations can improve the accuracy and efficiency of DCNNs. First, we introduce epitomic convolution as an alternative to the common convolution-MP cascade of DCNNs, which comes with the same computational cost but favorable learning properties. Second, we introduce a Multiple Instance Learning algorithm to accommodate global translation and scaling in image classification, yielding an efficient algorithm that trains and tests a DCNN in a consistent manner. Third, we develop a DCNN sliding window detector that explicitly, but efficiently, searches over the object's position, scale, and aspect ratio. We provide competitive image classification and localization results on the ImageNet dataset and object detection results on Pascal VOC2007.

Cite

Text

Papandreou et al. "Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7298636

Markdown

[Papandreou et al. "Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/papandreou2015cvpr-modeling/) doi:10.1109/CVPR.2015.7298636

BibTeX

@inproceedings{papandreou2015cvpr-modeling,
  title     = {{Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection}},
  author    = {Papandreou, George and Kokkinos, Iasonas and Savalle, Pierre-Andre},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2015},
  doi       = {10.1109/CVPR.2015.7298636},
  url       = {https://mlanthology.org/cvpr/2015/papandreou2015cvpr-modeling/}
}