On InstaHide, Phase Retrieval, and Sparse Matrix Factorization
Abstract
In this work, we examine the security of InstaHide, a scheme recently proposed by \cite{hsla20} for preserving the security of private datasets in the context of distributed learning. To generate a synthetic training example to be shared among the distributed learners, InstaHide takes a convex combination of private feature vectors and randomly flips the sign of each entry of the resulting vector with probability 1/2. A salient question is whether this scheme is secure in any provable sense, perhaps under a plausible complexity-theoretic assumption. The answer to this turns out to be quite subtle and closely related to the average-case complexity of a multi-task, missing-data version of the classic problem of phase retrieval that is interesting in its own right. Motivated by this connection, under the standard distributional assumption that the public/private feature vectors are isotropic Gaussian, we design an algorithm that can actually recover a private vector using only the public vectors and a sequence of synthetic vectors generated by InstaHide.
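To make the encoding step described above concrete, here is a minimal, illustrative sketch of an InstaHide-style generator: it mixes a few private and public feature vectors with random convex weights and then flips the sign of each coordinate independently with probability 1/2. The function name, the mixing sizes `k_priv`/`k_pub`, and the Gaussian feature setup are assumptions chosen only to mirror the abstract, not the authors' implementation.

```python
import numpy as np

def instahide_encode(private, public, k_priv=2, k_pub=4, rng=None):
    """Illustrative InstaHide-style encoding (hypothetical helper).

    Mixes k_priv private and k_pub public feature vectors with random
    convex weights, then flips the sign of every coordinate
    independently with probability 1/2, as described in the abstract.
    `private` and `public` have shapes (n_priv, d) and (n_pub, d).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = private.shape[1]

    # Pick the vectors participating in this synthetic example.
    chosen = np.vstack([
        private[rng.choice(len(private), k_priv, replace=False)],
        public[rng.choice(len(public), k_pub, replace=False)],
    ])

    # Random convex combination: nonnegative weights summing to 1.
    weights = rng.random(k_priv + k_pub)
    weights /= weights.sum()
    mixed = weights @ chosen

    # Independent random sign flip per coordinate (probability 1/2 each).
    signs = rng.choice([-1.0, 1.0], size=d)
    return signs * mixed

# Example with isotropic Gaussian features, matching the paper's
# distributional assumption on the public/private vectors.
rng = np.random.default_rng(0)
priv = rng.standard_normal((10, 32))
pub = rng.standard_normal((100, 32))
synthetic = instahide_encode(priv, pub, rng=rng)
```

Because the sign flip discards the sign of each mixed coordinate, an attacker effectively observes magnitudes of linear measurements, which is what ties the recovery problem to a multi-task, missing-data variant of phase retrieval.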
Cite
Text
Chen et al. "On InstaHide, Phase Retrieval, and Sparse Matrix Factorization." International Conference on Learning Representations, 2021.
Markdown
[Chen et al. "On InstaHide, Phase Retrieval, and Sparse Matrix Factorization." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/chen2021iclr-instahide/)
BibTeX
@inproceedings{chen2021iclr-instahide,
title = {{On InstaHide, Phase Retrieval, and Sparse Matrix Factorization}},
author = {Chen, Sitan and Li, Xiaoxiao and Song, Zhao and Zhuo, Danyang},
booktitle = {International Conference on Learning Representations},
year = {2021},
url = {https://mlanthology.org/iclr/2021/chen2021iclr-instahide/}
}