Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data
Abstract
Adversarial black-box attacks aim to craft adversarial perturbations by querying input–output pairs of machine learning models. They are widely used to evaluate the robustness of pre-trained models. However, black-box attacks often suffer from query inefficiency due to the high dimensionality of the input space, and can therefore give a false sense of model robustness. In this paper, we relax the conditions of the black-box threat model and propose a novel technique called the spanning attack. By constraining adversarial perturbations to a low-dimensional subspace spanned by an auxiliary unlabeled dataset, the spanning attack significantly improves the query efficiency of a wide variety of existing black-box attacks. Extensive experiments show that the proposed method works favorably in both soft-label and hard-label black-box attacks.
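The abstract's core mechanism is to search for perturbations only within the subspace spanned by auxiliary unlabeled examples, rather than in the full input space. Below is a minimal illustrative sketch in Python of that idea, assuming an SVD-derived orthonormal basis and a simple random-search soft-label attack; model_query (returning the model's loss on the true label), the step sizes, and the search loop are hypothetical placeholders rather than the authors' algorithm, and any query-based attack could be substituted for the loop.

import numpy as np

def spanning_basis(unlabeled, k):
    """Orthonormal basis (d, k) for the top-k directions spanned by
    the unlabeled data, obtained here via SVD of the centered matrix."""
    centered = unlabeled - unlabeled.mean(axis=0)   # (n, d)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T                                 # columns are orthonormal

def subspace_random_attack(model_query, x, V, eps=0.05, steps=1000):
    """Random-search attack confined to span(V): propose steps in the
    k-dim coordinate space and query the model on x + V @ delta."""
    k = V.shape[1]
    best = np.zeros(k)
    best_loss = model_query(x)                      # loss on the true label
    for _ in range(steps):
        cand = best + 0.01 * np.random.randn(k)     # step in k dims, not d
        cand = np.clip(cand, -eps, eps)             # crude magnitude control;
                                                    # V orthonormal => ||V c|| = ||c||
        loss = model_query(x + V @ cand)
        if loss > best_loss:                        # keep queries that help
            best, best_loss = cand, loss
    return x + V @ best

Because proposals live in k dimensions instead of the full input dimension d, each query explores a much smaller search space, which is the query-efficiency gain the abstract claims.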
Cite
Text
Wang et al. "Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data." Machine Learning, 2020. doi:10.1007/s10994-020-05916-1
Markdown
[Wang et al. "Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data." Machine Learning, 2020.](https://mlanthology.org/mlj/2020/wang2020mlj-spanning/) doi:10.1007/s10994-020-05916-1
BibTeX
@article{wang2020mlj-spanning,
title = {{Spanning Attack: Reinforce Black-Box Attacks with Unlabeled Data}},
author = {Wang, Lu and Zhang, Huan and Yi, Jinfeng and Hsieh, Cho-Jui and Jiang, Yuan},
journal = {Machine Learning},
year = {2020},
pages = {2349--2368},
doi = {10.1007/s10994-020-05916-1},
volume = {109},
url = {https://mlanthology.org/mlj/2020/wang2020mlj-spanning/}
}