Label-Only Model Inversion Attacks via Boundary Repulsion

Kahla, Mostafa; Chen, Si; Just, Hoang Anh; Jia, Ruoxi

doi:10.1109/CVPR52688.2022.01462

CVPR 2022 pp. 15045-15053

Abstract

Recent studies show that state-of-the-art deep neural networks are vulnerable to model inversion attacks, in which access to a model is abused to reconstruct private training data of any given target class. Existing attacks rely on access to either the complete target model (white-box) or the model's soft labels (black-box). However, no prior work has addressed the harder but more practical scenario in which the attacker has access only to the model's predicted label, without a confidence measure. In this paper, we introduce an algorithm, Boundary-Repelling Model Inversion (BREP-MI), to invert private training data using only the target model's predicted labels. The key idea of our algorithm is to evaluate the model's predicted labels over a sphere and then estimate the direction to reach the target class's centroid. Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data for various datasets and target model architectures. We compare BREP-MI with state-of-the-art white-box and black-box model inversion attacks, and the results show that, despite assuming less knowledge about the target model, BREP-MI outperforms the black-box attack and achieves results comparable to the white-box attack.
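
The key idea stated in the abstract (query predicted labels over a sphere, then estimate a direction toward the interior of the target class) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the step size, the radius growth factor, the early-stop rule, and the toy oracle (a ball around a hypothetical `CENTER` standing in for the target class region) are all assumptions made for illustration. In a real attack, `predict_label` would wrap queries to the target model.

```python
import numpy as np

def sample_unit_sphere(rng, n, dim):
    """Draw n directions uniformly from the unit sphere in `dim` dimensions."""
    v = rng.standard_normal((n, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def estimate_direction(predict_label, z, target, radius, n, rng):
    """Query labels on a sphere around z and average the signed directions.

    Directions whose endpoint keeps the target label count as +1, all others
    as -1, so the average points away from the nearby decision boundary.
    """
    units = sample_unit_sphere(rng, n, z.shape[0])
    labels = np.array([predict_label(z + radius * u) for u in units])
    inside = labels == target
    signs = np.where(inside, 1.0, -1.0)
    direction = (signs[:, None] * units).mean(axis=0)
    return direction, int(inside.sum())

def boundary_repelling_search(predict_label, z0, target, radius=1.0, step=0.5,
                              grow=1.3, n=200, iters=200, seed=0):
    """Label-only search: step along the estimated direction, grow the sphere
    when every sampled point already carries the target label.
    All hyperparameter defaults here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(iters):
        direction, n_inside = estimate_direction(
            predict_label, z, target, radius, n, rng)
        if n_inside == n:
            radius *= grow      # sphere fits inside the class: probe farther out
        elif n_inside == 0:
            break               # sphere overshot the class region (assumed stop rule)
        else:
            z = z + step * direction
    return z, radius

# Toy label-only oracle (assumption): class 1 is a ball of radius 3 around CENTER.
CENTER = np.array([1.0, -2.0, 0.5, 0.0, 3.0])

def toy_oracle(z):
    return 1 if np.linalg.norm(z - CENTER) < 3.0 else 0

# Start from a point already classified as the target, near its boundary.
start = CENTER + np.array([2.5, 0.0, 0.0, 0.0, 0.0])
final, final_radius = boundary_repelling_search(toy_oracle, start, target=1)
```

In this toy setting the search only ever sees hard labels, yet the signed average of sphere directions pushes the point away from the boundary and toward the middle of the class region, which is the behaviour the abstract describes.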

Cite

Text

Kahla et al. "Label-Only Model Inversion Attacks via Boundary Repulsion." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01462

Markdown

[Kahla et al. "Label-Only Model Inversion Attacks via Boundary Repulsion." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/kahla2022cvpr-labelonly/) doi:10.1109/CVPR52688.2022.01462

BibTeX

@inproceedings{kahla2022cvpr-labelonly,
  title     = {{Label-Only Model Inversion Attacks via Boundary Repulsion}},
  author    = {Kahla, Mostafa and Chen, Si and Just, Hoang Anh and Jia, Ruoxi},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {15045-15053},
  doi       = {10.1109/CVPR52688.2022.01462},
  url       = {https://mlanthology.org/cvpr/2022/kahla2022cvpr-labelonly/}
}