Compressing the Input for CNNs with the First-Order Scattering Transform

Abstract

We consider the first-order scattering transform as a candidate for reducing the signal processed by a convolutional neural network (CNN). We study this transformation and show theoretical and empirical evidence that, in the case of natural images and sufficiently small translation invariance, this transform preserves most of the signal information needed for classification while substantially reducing the spatial resolution and total signal size. We demonstrate that cascading a CNN with this representation performs on par with ImageNet classification models commonly used in downstream tasks, such as ResNet-50. We subsequently apply our ImageNet-trained hybrid model as the base model of a detection system, which typically has larger image inputs. On the Pascal VOC and COCO detection tasks we find this leads to substantial improvements in inference speed and training memory consumption compared to models trained directly on the input image.
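To make the compression concrete, the sketch below implements a generic first-order scattering transform in NumPy: Gabor wavelet modulus coefficients, blurred by a low-pass Gaussian and subsampled by 2^J. This is a minimal illustration and not the authors' implementation; the filter parameters, the helper names (`_gabor`, `_gaussian`, `_fft_convolve`, `first_order_scattering`), and the use of circular FFT convolution are all assumptions made for this example.

```python
import numpy as np

def _gaussian(h, w, sigma):
    """Centered 2D Gaussian low-pass filter (illustrative parameters)."""
    y, x = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()

def _gabor(h, w, xi, sigma, theta):
    """Centered 2D Gabor wavelet: Gaussian envelope times an oriented carrier."""
    y, x = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    xr = x * np.cos(theta) + y * np.sin(theta)  # coordinate along orientation theta
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) * np.exp(1j * xi * xr)
    return g / np.abs(g).sum()

def _fft_convolve(img, kernel):
    """Circular convolution via FFT; kernel is centered and same-shaped as img."""
    return np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(np.fft.ifftshift(kernel)))

def first_order_scattering(img, J=2, n_angles=4):
    """First-order scattering of a 2D image.

    Returns 1 + J * n_angles channels at 1/2^J of the input resolution:
    one low-pass (zeroth-order) channel plus, for each scale j and angle,
    the low-pass-averaged wavelet modulus |img * psi_{j,theta}|.
    """
    h, w = img.shape
    s = 2 ** J
    phi = _gaussian(h, w, sigma=0.8 * s)  # low-pass at the output scale
    feats = [np.real(_fft_convolve(img, phi))[::s, ::s]]  # zeroth-order channel
    for j in range(J):
        sigma = 0.8 * 2 ** j           # envelope width grows with scale
        xi = 3 * np.pi / (4 * 2 ** j)  # carrier frequency shrinks with scale
        for t in range(n_angles):
            theta = t * np.pi / n_angles
            psi = _gabor(h, w, xi, sigma, theta)
            u = np.abs(_fft_convolve(img, psi))          # wavelet modulus
            feats.append(np.real(_fft_convolve(u, phi))[::s, ::s])
    return np.stack(feats)

# With the defaults, a 32x32 image maps to 1 + 2*4 = 9 channels of size 8x8,
# i.e. the spatial resolution drops by 2^J in each direction.
```

The key property the abstract relies on is visible here: the modulus discards oscillating phase, so the coefficients vary slowly and can be subsampled by 2^J without losing the information a classifier needs, shrinking the spatial grid the downstream CNN must process.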

Cite

Text

Oyallon et al. "Compressing the Input for CNNs with the First-Order Scattering Transform." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01240-3_19

Markdown

[Oyallon et al. "Compressing the Input for CNNs with the First-Order Scattering Transform." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/oyallon2018eccv-compressing/) doi:10.1007/978-3-030-01240-3_19

BibTeX

@inproceedings{oyallon2018eccv-compressing,
  title     = {{Compressing the Input for CNNs with the First-Order Scattering Transform}},
  author    = {Oyallon, Edouard and Belilovsky, Eugene and Zagoruyko, Sergey and Valko, Michal},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2018},
  doi       = {10.1007/978-3-030-01240-3_19},
  url       = {https://mlanthology.org/eccv/2018/oyallon2018eccv-compressing/}
}