Improved OOD Generalization via Adversarial Training and Pretraining

Abstract

Recently, learning a model that generalizes well on out-of-distribution (OOD) data has attracted great attention in the machine learning community. In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation also generalizes well on OOD data. Inspired by previous findings that adversarial training helps improve robustness, we show that models trained by adversarial training attain a convergent excess risk on OOD data. Moreover, in the paradigm of pre-training then fine-tuning, we theoretically show that the input-perturbation-robust model obtained in the pre-training stage provides an initialization that generalizes well on downstream OOD data. Finally, various experiments conducted on image classification and natural language understanding tasks verify our theoretical findings.
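The abstract's key ingredient, adversarial training, can be illustrated with a minimal sketch. The snippet below trains a toy logistic model on worst-case FGSM-style input perturbations (a standard single-step approximation of the inner maximization); the toy data, step sizes, and model are assumptions for illustration only and are not the paper's exact procedure or experimental setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adv_train(X, y, eps=0.1, lr=0.1, steps=200, seed=0):
    """Adversarial training for a linear logistic model (illustrative sketch).

    Each step first crafts an FGSM-style L_inf perturbation of the inputs,
    then takes a gradient step on the perturbed (worst-case) batch.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    for _ in range(steps):
        # Gradient of the logistic loss w.r.t. the input x_i is (p_i - y_i) * w,
        # so the sign of that gradient gives the worst-case L_inf direction.
        p = sigmoid(X @ w)
        X_adv = X + eps * np.sign(np.outer(p - y, w))
        # Standard gradient descent step on the perturbed batch.
        p_adv = sigmoid(X_adv @ w)
        w -= lr * X_adv.T @ (p_adv - y) / len(y)
    return w

# Toy separable data: label 1 iff the feature sum is positive (assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X.sum(axis=1) > 0).astype(float)
w = adv_train(X, y)
acc = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))
```

In the paper's framing, training on such worst-case perturbations is what makes the model robust to input perturbation, which in turn is the property shown to control OOD generalization under the Wasserstein-distance definition.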

Cite

Text

Yi et al. "Improved OOD Generalization via Adversarial Training and Pretraining." International Conference on Machine Learning, 2021.

Markdown

[Yi et al. "Improved OOD Generalization via Adversarial Training and Pretraining." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/yi2021icml-improved/)

BibTeX

@inproceedings{yi2021icml-improved,
  title     = {{Improved OOD Generalization via Adversarial Training and Pretraining}},
  author    = {Yi, Mingyang and Hou, Lu and Sun, Jiacheng and Shang, Lifeng and Jiang, Xin and Liu, Qun and Ma, Zhiming},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {11987--11997},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/yi2021icml-improved/}
}