AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective

Abstract

As machine learning (ML) systems expand in both scale and functionality, the security landscape has become increasingly complex, with a proliferation of attacks and defenses. However, existing studies largely treat these threats in isolation, lacking a coherent framework that exposes their shared principles and interdependencies. This fragmented view hinders systematic understanding and limits the design of comprehensive defenses. Crucially, the two foundational assets of ML, data and models, are no longer independent; vulnerabilities in one directly compromise the other. The absence of a holistic framework leaves open questions about how these bidirectional risks propagate across the ML pipeline. To address this critical gap, we propose a unified closed-loop threat taxonomy that explicitly frames model–data interactions along four directional axes. Our framework offers a principled lens for analyzing and defending foundation models. The resulting four classes of security threats are distinct but interrelated: (1) Data→Data (D→D), including data decryption and watermark removal attacks; (2) Data→Model (D→M), including poisoning, harmful fine-tuning, and jailbreak attacks; (3) Model→Data (M→D), including model inversion, membership inference, and training data extraction attacks; and (4) Model→Model (M→M), including model extraction attacks. We conduct a systematic review of the mathematical formulations, attack and defense strategies, and applications of these threats across the vision, language, audio, and graph domains. Our unified framework elucidates the underlying connections among these security threats and establishes a foundation for developing scalable, transferable, and cross-modal security strategies, particularly within the landscape of foundation models.
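
As a concrete illustration of one M→D threat named above, below is a minimal Python sketch of a loss-threshold membership inference attack, where an adversary guesses whether an example was in the training set by checking whether the model's loss on it falls below a threshold. This is a hedged toy example, not the survey's formulation: the model confidences, the true labels, and the threshold tau are all hypothetical choices for illustration.

# Minimal sketch of a loss-threshold membership inference attack
# (an M->D threat in the taxonomy above). Illustrative only: the
# probability vectors, labels, and threshold tau are assumptions.
import numpy as np

def cross_entropy(probs: np.ndarray, label: int) -> float:
    """Per-example loss; training members tend to have lower loss."""
    return -float(np.log(probs[label] + 1e-12))

def infer_membership(probs: np.ndarray, label: int, tau: float = 0.5) -> bool:
    """Predict 'member' when the model's loss on (x, y) falls below tau."""
    return cross_entropy(probs, label) < tau

# Toy usage: a confident (low-loss) prediction is flagged as a likely member.
member_probs = np.array([0.05, 0.90, 0.05])     # model confident on true label 1
nonmember_probs = np.array([0.40, 0.35, 0.25])  # model unsure on true label 1
print(infer_membership(member_probs, 1))        # True  -> likely training member
print(infer_membership(nonmember_probs, 1))     # False -> likely non-member

The same loss-based signal underlies many of the M→D attacks the survey categorizes; stronger variants calibrate the threshold per example or train shadow models rather than fixing a single global tau.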

Cite

Text

Wang and Luan. "AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective." Transactions on Machine Learning Research, 2026.

Markdown

[Wang and Luan. "AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/wang2026tmlr-ai/)

BibTeX

@article{wang2026tmlr-ai,
  title     = {{AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective}},
  author    = {Wang, Zhenyi and Luan, Siyu},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/wang2026tmlr-ai/}
}