Learning to Transform for Generalizable Instance-Wise Invariance

Abstract

Computer vision research has long aimed to build systems that are robust to transformations found in natural data. Traditionally, this is done using data augmentation or hard-coding invariances into the architecture. However, too much or too little invariance can hurt, and the correct amount is unknown a priori and dependent on the instance. Ideally, the appropriate invariance would be learned from data and inferred at test time. We treat invariance as a prediction problem. Given any image, we predict a distribution over transformations. We use variational inference to learn this distribution end-to-end. Combined with a graphical model approach, this distribution forms a flexible, generalizable, and adaptive form of invariance. Our experiments show that it can be used to align datasets and discover prototypes, adapt to out-of-distribution poses, and generalize invariances across classes. When used for data augmentation, our method shows consistent gains in accuracy and robustness on CIFAR10, CIFAR10-LT, and TinyImageNet.

Cite

Text

Singhal et al. "Learning to Transform for Generalizable Instance-Wise Invariance." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00571

Markdown

[Singhal et al. "Learning to Transform for Generalizable Instance-Wise Invariance." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/singhal2023iccv-learning/) doi:10.1109/ICCV51070.2023.00571

BibTeX

@inproceedings{singhal2023iccv-learning,
  title     = {{Learning to Transform for Generalizable Instance-Wise Invariance}},
  author    = {Singhal, Utkarsh and Esteves, Carlos and Makadia, Ameesh and Yu, Stella X.},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {6211--6221},
  doi       = {10.1109/ICCV51070.2023.00571},
  url       = {https://mlanthology.org/iccv/2023/singhal2023iccv-learning/}
}