Making Convolutional Networks Shift-Invariant Again

Abstract

Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks leads to performance degradation; as a result, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling. The technique is general and can be incorporated across layer types and applications, such as image classification and conditional image generation. In addition to increased shift-invariance, we also observe, surprisingly, that anti-aliasing boosts accuracy in ImageNet classification across several commonly used architectures. This indicates that anti-aliasing serves as effective regularization. Our results demonstrate that this classical signal processing technique has been undeservedly overlooked in modern deep networks.
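The integration the abstract describes can be illustrated with a minimal 1-D sketch: a strided max-pool is decomposed into a dense (stride-1) max followed by a blur-then-subsample step. This is an illustrative NumPy sketch, not the paper's implementation; the binomial kernel [1, 2, 1] and the function names here are assumptions chosen for brevity (the paper evaluates several filter sizes).

```python
import numpy as np

def blur_pool_1d(x, stride=2):
    """Anti-aliased downsampling: low-pass filter, then subsample.

    Uses a binomial blur kernel [1, 2, 1] / 4 (one of several plausible
    choices; kernel size is a hyperparameter in the paper).
    """
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0
    # Reflect-pad by 1 so the blurred signal keeps the input length.
    padded = np.pad(x, 1, mode="reflect")
    blurred = np.convolve(padded, kernel, mode="valid")
    # Subsample only after low-pass filtering, per the sampling theorem.
    return blurred[::stride]

def max_blur_pool_1d(x, stride=2):
    """Replace a strided max-pool with dense max + anti-aliased subsampling."""
    # Dense max over a window of 2, stride 1 (evaluation is shift-equivariant).
    dense_max = np.maximum(x[:-1], x[1:])
    # Aliasing is introduced only at this final, blurred subsampling step.
    return blur_pool_1d(dense_max, stride)
```

For an impulse input `[0, 0, 4, 0, 0, 0]`, `blur_pool_1d` spreads the peak before subsampling (yielding `[0, 2, 0]`) instead of either keeping or dropping it wholesale depending on its alignment with the sampling grid, which is the instability plain strided pooling exhibits.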

Cite

Text

Zhang. "Making Convolutional Networks Shift-Invariant Again." International Conference on Machine Learning, 2019.

Markdown

[Zhang. "Making Convolutional Networks Shift-Invariant Again." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/zhang2019icml-making/)

BibTeX

@inproceedings{zhang2019icml-making,
  title     = {{Making Convolutional Networks Shift-Invariant Again}},
  author    = {Zhang, Richard},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {7324--7334},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/zhang2019icml-making/}
}