Making Convolutional Networks Shift-Invariant Again
Abstract
Modern convolutional networks are not shift-invariant, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks leads to performance degradation; as a result, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling. The technique is general and can be incorporated across layer types and applications, such as image classification and conditional image generation. In addition to increased shift-invariance, we also observe, surprisingly, that anti-aliasing boosts accuracy in ImageNet classification, across several commonly-used architectures. This indicates that anti-aliasing serves as effective regularization. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.
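The idea of low-pass filtering before downsampling can be illustrated with a minimal 1-D NumPy sketch. This is not the author's implementation: the function names `max_pool` and `max_blur_pool`, the circular boundary handling, and the `[1, 2, 1]` binomial filter are all assumptions chosen to keep the example short. It compares plain strided max-pooling with a variant that takes the max densely (stride 1), blurs, and only then subsamples.

```python
import numpy as np

def dense_max(x, k=2):
    """Max over k consecutive samples at every position (stride 1, circular)."""
    return np.max(np.stack([np.roll(x, -j) for j in range(k)]), axis=0)

def max_pool(x, k=2, stride=2):
    """Plain max-pooling: dense max followed directly by subsampling."""
    return dense_max(x, k)[::stride]

def max_blur_pool(x, k=2, stride=2, taps=(1.0, 2.0, 1.0)):
    """Hypothetical anti-aliased pooling: dense max, low-pass blur, then subsample."""
    dense = dense_max(x, k)
    kernel = np.asarray(taps) / np.sum(taps)   # normalized binomial filter (assumed)
    half = len(kernel) // 2
    # Circular convolution with the symmetric kernel.
    blurred = sum(w * np.roll(dense, j - half) for j, w in enumerate(kernel))
    return blurred[::stride]

x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0])
x_shifted = np.roll(x, 1)                      # shift the input by one sample

# Largest change in the output caused by the one-sample input shift.
naive_gap = np.abs(max_pool(x) - max_pool(x_shifted)).max()
blur_gap = np.abs(max_blur_pool(x) - max_blur_pool(x_shifted)).max()
print(naive_gap, blur_gap)
```

On this toy signal the one-sample shift changes the plain max-pooled output by up to 1.0, while the blurred variant changes by at most 0.25. The blur does not make the output exactly invariant; it only reduces the sensitivity that subsampling introduces.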
Cite
Text
Zhang. "Making Convolutional Networks Shift-Invariant Again." International Conference on Machine Learning, 2019.
Markdown
[Zhang. "Making Convolutional Networks Shift-Invariant Again." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/zhang2019icml-making/)
BibTeX
@inproceedings{zhang2019icml-making,
title = {{Making Convolutional Networks Shift-Invariant Again}},
author = {Zhang, Richard},
booktitle = {International Conference on Machine Learning},
year = {2019},
pages = {7324-7334},
volume = {97},
url = {https://mlanthology.org/icml/2019/zhang2019icml-making/}
}