Better Aggregation in Test-Time Augmentation

Abstract

Test-time augmentation, the aggregation of predictions across transformed versions of a test input, is a common practice in image classification. Traditionally, predictions are combined using a simple average. In this paper, we present 1) experimental analyses that shed light on cases in which the simple average is suboptimal and 2) a method to address these shortcomings. A key finding is that even when test-time augmentation produces a net improvement in accuracy, it can change many correct predictions into incorrect predictions. We examine when and why test-time augmentation changes a prediction from being correct to incorrect and vice versa. Building on these insights, we present a learning-based method for aggregating test-time augmentations. Experiments across a diverse set of models, datasets, and augmentations show that our method delivers consistent improvements over existing approaches.
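To make the aggregation step concrete, the following minimal sketch contrasts a simple average with a weighted average over augmented views of one test input. It is an illustration under stated assumptions, not the authors' implementation: the probabilities are made up, the three views (original, flip, crop) are hypothetical, and the weights are hand-picked here, whereas a learning-based aggregator would fit them on held-out data. The abstract does not specify the form of the learned aggregation, so a per-augmentation weighted average is only one plausible choice.

```python
import numpy as np


def aggregate_mean(probs):
    """Simple average over augmentations.

    probs has shape (n_augmentations, n_classes).
    """
    return probs.mean(axis=0)


def aggregate_weighted(probs, weights):
    """Weighted average with one weight per augmentation.

    Weights are normalized to sum to 1 so the output stays a distribution.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ probs


# Made-up class probabilities for a single test image (true class: 0).
probs = np.array([
    [0.60, 0.40],  # original image: correct on its own
    [0.30, 0.70],  # hypothetical flipped view: incorrect
    [0.35, 0.65],  # hypothetical cropped view: incorrect
])
true_label = 0

# The simple average is pulled toward the two incorrect views.
mean_pred = int(np.argmax(aggregate_mean(probs)))

# Hand-picked weights (an assumption) that favor the original image.
weights = [0.8, 0.1, 0.1]
weighted_pred = int(np.argmax(aggregate_weighted(probs, weights)))

print("original correct:", int(np.argmax(probs[0])) == true_label)
print("simple average correct:", mean_pred == true_label)
print("weighted average correct:", weighted_pred == true_label)
```

In this toy case the simple average turns a correct prediction into an incorrect one, which is the failure mode the abstract describes, while a weighting that trusts the original view more keeps it correct.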

Cite

Text

Shanmugam et al. "Better Aggregation in Test-Time Augmentation." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00125

Markdown

[Shanmugam et al. "Better Aggregation in Test-Time Augmentation." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/shanmugam2021iccv-better/) doi:10.1109/ICCV48922.2021.00125

BibTeX

@inproceedings{shanmugam2021iccv-better,
  title     = {{Better Aggregation in Test-Time Augmentation}},
  author    = {Shanmugam, Divya and Blalock, Davis and Balakrishnan, Guha and Guttag, John},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {1214-1223},
  doi       = {10.1109/ICCV48922.2021.00125},
  url       = {https://mlanthology.org/iccv/2021/shanmugam2021iccv-better/}
}