PrivShap: A Finer-Granularity Network Linearization Method for Private Inference
Abstract
Private inference applies cryptographic techniques such as homomorphic encryption, garbled circuits, and secret sharing to protect the privacy of both parties in a client-server setting during inference. It is often hindered by high communication overhead, especially at non-linear activation layers such as ReLU. ReLU pruning has therefore been widely recognized as an effective way to accelerate private inference. Existing approaches to ReLU pruning typically rely on coarse hypotheses, assuming either that the importance of ReLU layers is inversely correlated with that of linear layers, or that shallow activation layers are universally less important, in order to assign per-layer ReLU budgets while preserving inference accuracy. However, these assumptions rest on limited empirical evidence and can fail to generalize to diverse model architectures. In this work, we introduce a finer-granularity ReLU budget assignment approach that assesses the layer-wise importance of ReLU with the Shapley value. To address the computational burden of exact Shapley value calculation, we propose a tree-trimming algorithm for fast estimation. We provide both theoretical guarantees and empirical validation of our method. Extensive experiments show that we achieve better efficiency and accuracy than the state of the art across diverse model architectures, activation functions, and datasets. Specifically, we need $\sim 2.5\times$ fewer ReLU operations to reach similar inference accuracy, and we gain up to $\sim 8.13\%$ in inference accuracy under similar ReLU budgets.
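The abstract does not detail the tree-trimming estimator, but the underlying idea of Shapley-based layer importance can be illustrated generically: treat each ReLU layer as a player in a cooperative game whose payoff is model accuracy, and estimate each layer's Shapley value by sampling random permutations. The sketch below is a minimal, self-contained Monte Carlo permutation estimator; the names shapley_permutation_estimate, value_fn, and the toy layer_weights payoff are illustrative assumptions, not the paper's actual algorithm or API.

import random

def shapley_permutation_estimate(players, value_fn, num_samples=200, seed=0):
    """Monte Carlo estimate of Shapley values via random permutations.

    players:  list of player ids (here: indices of ReLU layers)
    value_fn: maps a frozenset of "active" players to a scalar payoff
              (in this setting: e.g., validation accuracy of the model
              with only those ReLU layers kept non-linear)
    """
    rng = random.Random(seed)
    contrib = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = frozenset()
        prev_value = value_fn(coalition)
        # Accumulate each player's marginal contribution in this ordering.
        for p in order:
            coalition = coalition | {p}
            cur_value = value_fn(coalition)
            contrib[p] += cur_value - prev_value
            prev_value = cur_value
    return {p: c / num_samples for p, c in contrib.items()}

if __name__ == "__main__":
    # Hypothetical additive payoff standing in for "accuracy with this
    # subset of ReLU layers kept"; a real value_fn would evaluate a
    # partially linearized network on held-out data.
    layer_weights = {0: 0.05, 1: 0.20, 2: 0.10, 3: 0.02}

    def toy_value(coalition):
        return sum(layer_weights[p] for p in coalition)

    shap = shapley_permutation_estimate(list(layer_weights), toy_value)
    print({p: round(v, 3) for p, v in sorted(shap.items())})

Because the toy payoff is additive, the estimated Shapley values should recover layer_weights exactly, which makes the example a convenient sanity check. Exact Shapley computation requires all subsets, which is why fast estimators such as the paper's tree-trimming algorithm matter in practice.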
Cite
Text
Xu et al. "PrivShap: A Finer-Granularity Network Linearization Method for Private Inference." Transactions on Machine Learning Research, 2025.
Markdown
[Xu et al. "PrivShap: A Finer-Granularity Network Linearization Method for Private Inference." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/xu2025tmlr-privshap/)
BibTeX
@article{xu2025tmlr-privshap,
title = {{PrivShap: A Finer-Granularity Network Linearization Method for Private Inference}},
author = {Xu, Xiangrui and Wang, Zhenzhen and Ning, Rui and Xin, Chunsheng and Wu, Hongyi},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/xu2025tmlr-privshap/}
}