Making Convolutional Networks Shift-Invariant Again

Abstract

Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks leads to performance degradation; as a result, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling. The technique is general and can be incorporated across layer types and applications, such as image classification and conditional image generation. In addition to increased shift-invariance, we also observe, surprisingly, that anti-aliasing boosts accuracy in ImageNet classification across several commonly used architectures. This indicates that anti-aliasing serves as effective regularization. Our results demonstrate that this classical signal processing technique has been undeservedly overlooked in modern deep networks.
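The integration the abstract describes can be illustrated with a minimal 1-D sketch: a strided max-pool is decomposed into a dense (stride-1) max followed by a blur-then-subsample step. This is an illustrative NumPy sketch, not the paper's implementation; the binomial kernel [1, 2, 1] and the function names here are assumptions chosen for brevity (the paper evaluates several filter sizes).

```python
import numpy as np

def blur_pool_1d(x, stride=2):
    """Anti-aliased downsampling: low-pass filter, then subsample.

    Uses a binomial blur kernel [1, 2, 1] / 4 (one of several plausible
    choices; kernel size is a hyperparameter in the paper).
    """
    kernel = np.array([1.0, 2.0, 1.0]) / 4.0
    # Reflect-pad by 1 so the blurred signal keeps the input length.
    padded = np.pad(x, 1, mode="reflect")
    blurred = np.convolve(padded, kernel, mode="valid")
    # Subsample only after low-pass filtering, per the sampling theorem.
    return blurred[::stride]

def max_blur_pool_1d(x, stride=2):
    """Replace a strided max-pool with dense max + anti-aliased subsampling."""
    # Dense max over a window of 2, stride 1 (evaluation is shift-equivariant).
    dense_max = np.maximum(x[:-1], x[1:])
    # Aliasing is introduced only at this final, blurred subsampling step.
    return blur_pool_1d(dense_max, stride)
```

For an impulse input `[0, 0, 4, 0, 0, 0]`, `blur_pool_1d` spreads the peak before subsampling (yielding `[0, 2, 0]`) instead of either keeping or dropping it wholesale depending on its alignment with the sampling grid, which is the instability plain strided pooling exhibits.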

Cite

Text

Zhang. "Making Convolutional Networks Shift-Invariant Again." International Conference on Machine Learning, 2019.

Markdown

[Zhang. "Making Convolutional Networks Shift-Invariant Again." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/zhang2019icml-making/)

BibTeX

@inproceedings{zhang2019icml-making,
  title     = {{Making Convolutional Networks Shift-Invariant Again}},
  author    = {Zhang, Richard},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {7324--7334},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/zhang2019icml-making/}
}