Splitting & Integrating: Out-of-Distribution Detection via Adversarial Gradient Attribution

Abstract

Out-of-distribution (OOD) detection is essential for enhancing the robustness and security of deep learning models in unknown and dynamic data environments. Gradient-based OOD detection methods, such as GAIA, analyse the explanation pattern representations of in-distribution (ID) and OOD samples by examining the sensitivity of model outputs with respect to model inputs, resulting in superior performance compared to traditional OOD detection methods. However, we argue that the non-zero gradient behaviors of OOD samples do not exhibit significant distinguishability, especially when ID samples are subjected to random perturbations in high-dimensional spaces, which negatively impacts the accuracy of OOD detection. In this paper, we propose a novel OOD detection method called S & I, based on layer Splitting and gradient Integration via Adversarial Gradient Attribution. Specifically, our approach involves splitting the model’s intermediate layers and iteratively updating adversarial examples layer-by-layer. We then integrate the attribution gradients from each intermediate layer along the attribution path from adversarial examples to the actual input, yielding true explanation pattern representations for both ID and OOD samples. Experiments demonstrate that our S & I algorithm achieves state-of-the-art results, with an average FPR95 of 29.05% (ResNet34) and 38.61% (WRN40) on the CIFAR100 benchmark and 37.31% (BiT-S) on the ImageNet benchmark. Our code is available at: https://github.com/LMBTough/S-I
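
The FPR95 figures quoted above are usually read as the false positive rate on OOD samples at the score threshold that still accepts 95% of ID samples. The following is a minimal sketch of that metric, not code from the authors' repository. It assumes that a higher detector score means "more in-distribution"; the function name `fpr_at_95_tpr` and the toy scores are hypothetical and chosen only for illustration.

```python
import math


def fpr_at_95_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted as ID at the threshold keeping `tpr` of ID samples.

    Assumption: a higher score means the detector considers the sample more in-distribution.
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of top-ranked ID samples that reaches the target TPR
    # (the small epsilon guards against floating-point round-up).
    k = max(1, math.ceil(tpr * len(ranked) - 1e-9))
    threshold = ranked[k - 1]
    # An OOD sample at or above the threshold is a false positive.
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


# Toy example: 20 ID scores (1..20) and 4 OOD scores.
toy_id = [float(i) for i in range(1, 21)]
toy_ood = [0.5, 1.5, 2.5, 10.0]
print(fpr_at_95_tpr(toy_id, toy_ood))  # 0.5
```

In the toy example the threshold lands on the ID score 2.0 (19 of 20 ID samples are kept), and two of the four OOD scores clear it, so the result is 0.5. Lower values are better; with tied scores the realised TPR can exceed the 95% target.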

Cite

Text

Zhang et al. "Splitting & Integrating: Out-of-Distribution Detection via Adversarial Gradient Attribution." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Zhang et al. "Splitting & Integrating: Out-of-Distribution Detection via Adversarial Gradient Attribution." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/zhang2025icml-splitting/)

BibTeX

@inproceedings{zhang2025icml-splitting,
  title     = {{Splitting \& Integrating: Out-of-Distribution Detection via Adversarial Gradient Attribution}},
  author    = {Zhang, Jiayu and Wang, Xinyi and Jin, Zhibo and Zhu, Zhiyu and Zhou, Jianlong and Chen, Fang and Chen, Huaming},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {76213--76224},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/zhang2025icml-splitting/}
}