Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data

Abstract

Adversarial black-box attacks craft adversarial perturbations by querying the input–output behavior of a machine learning model, and are widely used to evaluate the robustness of pre-trained models. However, black-box attacks often suffer from query inefficiency due to the high dimensionality of the input space, and can therefore convey a false sense of model robustness. In this paper, we relax the conditions of the black-box threat model and propose a novel technique called the spanning attack. By constraining adversarial perturbations to a low-dimensional subspace spanned by an auxiliary unlabeled dataset, the spanning attack significantly improves the query efficiency of a wide variety of existing black-box attacks. Extensive experiments show that the proposed method works favorably in both soft-label and hard-label black-box attacks.
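
The abstract describes the method only at a high level. As a rough, non-authoritative illustration of the core idea, the Python sketch below builds an orthonormal basis for the subspace spanned by unlabeled examples and runs a simple random-search attack whose perturbation is confined to that subspace. The oracle query_loss, the helper names, the step size, and the random-search loop are all illustrative assumptions; the paper itself plugs the subspace constraint into existing black-box attacks rather than prescribing this particular search.

import numpy as np

def spanning_basis(unlabeled, k):
    """Orthonormal basis for the subspace spanned by unlabeled examples.

    unlabeled: (n, d) matrix of flattened unlabeled inputs.
    k: target subspace dimension (k <= min(n, d)).
    """
    # Center the data and take the top-k right singular vectors as the basis.
    X = unlabeled - unlabeled.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k]  # shape (k, d); rows are orthonormal basis vectors

def subspace_random_attack(query_loss, x, basis, eps=0.05, steps=1000, seed=None):
    """Query-based random search confined to span(basis).

    query_loss: black-box oracle returning a scalar loss for a given input
                (hypothetical; stands in for whatever the attacked model exposes).
    x:          flattened clean input of dimension d.
    basis:      (k, d) orthonormal basis from spanning_basis.
    """
    rng = np.random.default_rng(seed)
    delta = np.zeros(basis.shape[0])   # perturbation in k-dim subspace coordinates
    best = query_loss(x)
    for _ in range(steps):
        # Propose a small random step in the k-dimensional subspace
        # (0.01 is an arbitrary step size for illustration).
        cand = delta + 0.01 * rng.standard_normal(basis.shape[0])
        # Project back onto the L2 ball of radius eps.
        cand *= min(1.0, eps / (np.linalg.norm(cand) + 1e-12))
        # Map subspace coordinates back to input space before querying.
        loss = query_loss(x + cand @ basis)
        if loss > best:                # untargeted: increase the model's loss
            best, delta = loss, cand
    return x + delta @ basis

# Hypothetical usage: attack one flattened input given a scalar loss oracle.
# x_adv = subspace_random_attack(query_loss, x.ravel(), spanning_basis(unlabeled, k=64))

Because every query is made at x plus a vector lying in span(basis), the search effectively operates in k dimensions instead of d, which is the mechanism the abstract credits for the query-efficiency gains.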

Cite

Text

Wang et al. "Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data." Machine Learning, 2020. doi:10.1007/s10994-020-05916-1

Markdown

[Wang et al. "Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data." Machine Learning, 2020.](https://mlanthology.org/mlj/2020/wang2020mlj-spanning/) doi:10.1007/s10994-020-05916-1

BibTeX

@article{wang2020mlj-spanning,
  title     = {{Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data}},
  author    = {Wang, Lu and Zhang, Huan and Yi, Jinfeng and Hsieh, Cho-Jui and Jiang, Yuan},
  journal   = {Machine Learning},
  year      = {2020},
  pages     = {2349--2368},
  doi       = {10.1007/s10994-020-05916-1},
  volume    = {109},
  url       = {https://mlanthology.org/mlj/2020/wang2020mlj-spanning/}
}