Adversarial Training with Synthesized Data: A Path to Robust and Generalizable Neural Networks

Abstract

Adversarial Training (AT) is a well-known framework designed to mitigate adversarial vulnerabilities in neural networks. Recent research indicates that incorporating adversarial examples (AEs) in training can enhance models' generalization capabilities. To understand the impact of AEs on learning dynamics, we study AT through the lens of sample-difficulty methodologies. Our findings show that AT leads to more stable learning dynamics than Natural Training (NT), resulting in gradual performance improvements and less overconfident predictions. This suggests that AT steers training away from learning easy, perturbable spurious features and toward more resilient and generalizable ones. However, robust overfitting creates a trade-off between adversarial robustness and generalization gains, limiting practical deployment. To address this, we propose using synthesized data to bridge this gap. Our results demonstrate that AT benefits significantly from synthesized data whereas NT does not: with synthesized data, AT improves generalization without compromising robustness, offering new avenues for developing robust and generalizable models.
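To make the AT framework in the abstract concrete, here is a minimal sketch: a logistic-regression model trained on inputs perturbed by a one-step sign-gradient attack (FGSM) as an approximation of AT's inner maximization. The toy dataset, perturbation budget `eps`, learning rate, and epoch count are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """One-step sign-gradient perturbation of the inputs (inner maximization)."""
    p = sigmoid(x @ w + b)
    grad_x = np.outer(p - y, w)          # d(BCE)/dx for logistic regression
    return x + eps * np.sign(grad_x)

def adversarial_train(x, y, eps=0.1, lr=0.5, epochs=200, seed=0):
    """Outer minimization: gradient descent on the adversarially perturbed batch."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=x.shape[1])
    b = 0.0
    for _ in range(epochs):
        x_adv = fgsm_perturb(x, y, w, b, eps)   # craft AEs for current weights
        p = sigmoid(x_adv @ w + b)              # train on the AEs, not clean data
        w -= lr * x_adv.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Toy linearly separable data: label = 1 iff the first feature is positive.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = (X[:, 0] > 0).astype(float)
w, b = adversarial_train(X, Y)
clean_acc = np.mean((sigmoid(X @ w + b) > 0.5) == Y)
```

Replacing the single FGSM step with several projected gradient steps (PGD) yields the stronger multi-step variant commonly used in the AT literature.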

Cite

Text

Bayat and Rish. "Adversarial Training with Synthesized Data: A Path to Robust and Generalizable Neural Networks." ICML 2024 Workshops: NextGenAISafety, 2024.

Markdown

[Bayat and Rish. "Adversarial Training with Synthesized Data: A Path to Robust and Generalizable Neural Networks." ICML 2024 Workshops: NextGenAISafety, 2024.](https://mlanthology.org/icmlw/2024/bayat2024icmlw-adversarial/)

BibTeX

@inproceedings{bayat2024icmlw-adversarial,
  title     = {{Adversarial Training with Synthesized Data: A Path to Robust and Generalizable Neural Networks}},
  author    = {Bayat, Reza and Rish, Irina},
  booktitle = {ICML 2024 Workshops: NextGenAISafety},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/bayat2024icmlw-adversarial/}
}