Fixing the Train-Test Resolution Discrepancy

Abstract

Data augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the size of the objects seen by the classifier at train and test time: in fact, a lower train resolution improves the classification at test time! We then propose a simple strategy to optimize the classifier performance, which employs different train and test resolutions. It relies on a computationally cheap fine-tuning of the network at the test resolution. This enables training strong classifiers using small training images, and therefore significantly reduces the training time. For instance, we obtain 77.1% top-1 accuracy on ImageNet with a ResNet-50 trained on 128x128 images, and 79.8% with one trained at 224x224. A ResNeXt-101 32x48d pre-trained with weak supervision on 940 million 224x224 images and further optimized with our technique for test resolution 320x320 achieves 86.4% top-1 accuracy (top-5: 98.0%). To the best of our knowledge this is the highest ImageNet single-crop accuracy to date.
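The discrepancy the abstract describes can be illustrated with a rough back-of-the-envelope estimate: the standard train-time RandomResizedCrop(224) zooms into a sub-region of the image before resizing, while the usual test-time Resize(256) + CenterCrop(224) does not, so objects appear smaller at test time on average. The Monte-Carlo sketch below (a hypothetical helper, not code from the paper) estimates that average magnification ratio, assuming square images and the common crop-scale range (0.08, 1.0); note the image side length cancels out of the ratio:

```python
import math
import random


def expected_magnification_ratio(train_size=224, test_resize=256,
                                 scale=(0.08, 1.0), n=100_000, seed=0):
    """Monte-Carlo estimate of the average ratio between the apparent
    object magnification under train-time RandomResizedCrop and under
    test-time Resize + CenterCrop, for a square source image of side H.

    Train: a crop of area fraction s (uniform in `scale`) has side
    sqrt(s)*H and is resized to `train_size`, so magnification is
    train_size / (sqrt(s) * H).
    Test: the shorter side H is resized to `test_resize` and then
    center-cropped, so magnification is test_resize / H.
    H cancels in the ratio, so no concrete image size is needed.
    """
    rng = random.Random(seed)
    lo, hi = scale
    total = 0.0
    for _ in range(n):
        s = rng.uniform(lo, hi)                # sampled crop area fraction
        train_mag = train_size / math.sqrt(s)  # up to the common 1/H factor
        test_mag = test_resize                 # same 1/H factor cancels
        total += train_mag / test_mag
    return total / n
```

Under these simplifying assumptions the estimate comes out well above 1, i.e. objects are noticeably larger on average during training, which is consistent with the paper's remedy of raising (or fine-tuning at) a larger test resolution.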

Cite

Text

Touvron et al. "Fixing the Train-Test Resolution Discrepancy." Neural Information Processing Systems, 2019.

Markdown

[Touvron et al. "Fixing the Train-Test Resolution Discrepancy." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/touvron2019neurips-fixing/)

BibTeX

@inproceedings{touvron2019neurips-fixing,
  title     = {{Fixing the Train-Test Resolution Discrepancy}},
  author    = {Touvron, Hugo and Vedaldi, Andrea and Douze, Matthijs and J{\'e}gou, Herv{\'e}},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {8252--8262},
  url       = {https://mlanthology.org/neurips/2019/touvron2019neurips-fixing/}
}