Adversarial Training with Generated Data in High-Dimensional Regression: An Asymptotic Study

Abstract

In recent years, studies such as \cite{carmon2019unlabeled,gowal2021improving,xing2022artificial} have demonstrated that incorporating additional real or generated data with pseudo-labels can enhance adversarial training via a two-stage training approach. In this paper, we perform a theoretical analysis of the asymptotic behavior of this method in high-dimensional linear regression. While a double-descent phenomenon can be observed in ridgeless training, two-stage adversarial training with an appropriate $\ell_2$ regularization achieves better performance. Finally, we derive a shortcut cross-validation formula specifically tailored for the two-stage training method.
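The abstract describes the method only at a high level; the sketch below illustrates what such a two-stage procedure can look like in the linear-regression setting. It is an illustrative assumption-laden sketch, not the paper's exact estimator: the function name two_stage_adv_train, the hyperparameter values, and the use of plain gradient descent are all hypothetical. It relies on the closed form of the adversarial square loss under $\ell_2$-bounded input perturbations, which is standard in this line of work: $\max_{\|\delta\|_2 \le \epsilon} \big(y - (x+\delta)^\top \theta\big)^2 = \big(|y - x^\top \theta| + \epsilon \|\theta\|_2\big)^2$.

    import numpy as np

    def two_stage_adv_train(X_real, y_real, X_gen, eps=0.1, lam=0.5,
                            lr=0.01, n_iter=2000, seed=0):
        """Hypothetical sketch of two-stage adversarial training for
        linear regression with l2-bounded attacks.

        Stage 1: fit a ridge estimator on the real data and use it to
                 pseudo-label the generated covariates X_gen.
        Stage 2: minimize the adversarial square loss
                 (|y - x^T theta| + eps * ||theta||_2)^2 + lam * ||theta||_2^2
                 on the pooled real + pseudo-labeled sample by gradient descent.
        """
        n, d = X_real.shape

        # Stage 1: ridge pseudo-labeler, fit on the real labeled data only.
        theta_hat = np.linalg.solve(X_real.T @ X_real + lam * np.eye(d),
                                    X_real.T @ y_real)
        y_gen = X_gen @ theta_hat  # pseudo-labels for the generated data

        # Pool real and pseudo-labeled samples for Stage 2.
        X = np.vstack([X_real, X_gen])
        y = np.concatenate([y_real, y_gen])
        N = len(y)

        rng = np.random.default_rng(seed)
        theta = rng.normal(scale=0.01, size=d)
        for _ in range(n_iter):
            r = y - X @ theta                      # clean residuals
            norm = np.linalg.norm(theta) + 1e-12   # avoid division by zero
            adv = np.abs(r) + eps * norm           # worst-case residual magnitude
            # Gradient of the mean adversarial square loss w.r.t. theta.
            grad = (2.0 / N) * (-(X.T @ (adv * np.sign(r)))
                                + eps * adv.sum() * theta / norm)
            grad += 2.0 * lam * theta              # ridge penalty gradient
            theta -= lr * grad
        return theta

A hypothetical usage on synthetic data, with generated covariates standing in for the unlabeled sample:

    rng = np.random.default_rng(1)
    n, m, d = 200, 1000, 100
    theta_star = rng.normal(size=d) / np.sqrt(d)
    X_real = rng.normal(size=(n, d))
    y_real = X_real @ theta_star + rng.normal(size=n)
    X_gen = rng.normal(size=(m, d))  # generated covariates, no labels
    theta_adv = two_stage_adv_train(X_real, y_real, X_gen, eps=0.1, lam=1.0)

For simplicity the sketch reuses one lam for both the Stage-1 pseudo-labeler and the Stage-2 ridge penalty; in practice the two levels could be tuned separately, which is the kind of tuning the paper's shortcut cross-validation formula is presumably designed to accelerate.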

Cite

Text

Xing. "Adversarial Training with Generated Data in High-Dimensional Regression: An Asymptotic Study." ICML 2023 Workshops: AdvML-Frontiers, 2023.

Markdown

[Xing. "Adversarial Training with Generated Data in High-Dimensional Regression: An Asymptotic Study." ICML 2023 Workshops: AdvML-Frontiers, 2023.](https://mlanthology.org/icmlw/2023/xing2023icmlw-adversarial/)

BibTeX

@inproceedings{xing2023icmlw-adversarial,
  title     = {{Adversarial Training with Generated Data in High-Dimensional Regression: An Asymptotic Study}},
  author    = {Xing, Yue},
  booktitle = {ICML 2023 Workshops: AdvML-Frontiers},
  year      = {2023},
  url       = {https://mlanthology.org/icmlw/2023/xing2023icmlw-adversarial/}
}