Differentially Private Sparse Linear Regression with Heavy-Tailed Responses
Abstract
As a fundamental problem in machine learning and differential privacy (DP), DP linear regression has been extensively studied. However, most existing methods focus primarily on either regular data distributions or low-dimensional cases with irregular data. To address these limitations, this paper provides a comprehensive study of DP sparse linear regression with heavy-tailed responses in high-dimensional settings. In the first part, we introduce the DP-IHT-H method, which leverages the Huber loss and private iterative hard thresholding to achieve an estimation error bound of $\tilde{O}\Bigl( (s^*)^{\frac{1}{2}} \cdot \bigl(\frac{\log d}{n}\bigr)^{\frac{\zeta}{1+\zeta}} + (s^*)^{\frac{1+2\zeta}{2+2\zeta}} \cdot \bigl(\frac{\log^2 d}{n\varepsilon}\bigr)^{\frac{\zeta}{1+\zeta}} \Bigr)$ under the $(\varepsilon, \delta)$-DP model, where $n$ is the sample size, $d$ is the dimensionality, $s^*$ is the sparsity of the parameter, and $\zeta \in (0, 1]$ characterizes the tail heaviness of the data. In the second part, we propose DP-IHT-L, which further improves the error bound under additional assumptions on the response and achieves $\tilde{O}\Bigl(\frac{(s^*)^{3/2} \log d}{n\varepsilon}\Bigr)$. Compared to the first result, this bound is independent of the tail parameter $\zeta$. Finally, through experiments on synthetic and real-world datasets, we demonstrate that our methods outperform standard DP algorithms designed for “regular” data.
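To make the two ingredients named in the abstract concrete, the following is a minimal sketch of iterative hard thresholding on the Huber loss with Gaussian-perturbed gradients. It is not the authors' DP-IHT-H algorithm: the function names, the parameters `tau`, `step`, and `noise_std`, and the choice of plain top-`s` selection are all assumptions made for illustration, and the noise scale is not calibrated to any $(\varepsilon, \delta)$ guarantee. A real DP method would also need bounded sensitivity and a private selection step.

```python
import random


def huber_grad(r, tau):
    """Derivative of the Huber loss at residual r: r clipped to [-tau, tau]."""
    return max(-tau, min(tau, r))


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    keep = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)[:s]
    out = [0.0] * len(v)
    for j in keep:
        out[j] = v[j]
    return out


def noisy_iht_huber(X, y, s, tau, step, iters, noise_std, rng):
    """Illustrative noisy IHT on the Huber loss.

    noise_std is a hypothetical knob: it is NOT derived from a privacy
    budget, so this function provides no formal DP guarantee.
    """
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    for _ in range(iters):
        # Average Huber gradient: (1/n) * sum_i clip(<x_i, theta> - y_i) * x_i
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            r = sum(a * b for a, b in zip(xi, theta)) - yi
            g = huber_grad(r, tau)
            for j in range(d):
                grad[j] += g * xi[j] / n
        # Gradient step with Gaussian perturbation, then project to s-sparse.
        theta = [t - step * (gj + rng.gauss(0.0, noise_std))
                 for t, gj in zip(theta, grad)]
        theta = hard_threshold(theta, s)
    return theta


if __name__ == "__main__":
    rng = random.Random(0)
    n, d, s = 200, 20, 2
    theta_star = [2.0, -1.0] + [0.0] * (d - 2)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    # Symmetric Pareto noise with shape 1.5: finite mean, infinite variance.
    y = [sum(a * b for a, b in zip(row, theta_star))
         + rng.choice([-1.0, 1.0]) * (rng.paretovariate(1.5) - 1.0)
         for row in X]
    est = noisy_iht_huber(X, y, s=s, tau=2.0, step=0.5, iters=50,
                          noise_std=0.01, rng=rng)
    print([round(t, 3) for t in est])
```

Clipping the residual at `tau` is what limits the influence of a single heavy-tailed response on the gradient, and the hard-thresholding step is what keeps every iterate `s`-sparse in the high-dimensional setting.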
Cite
Text
Tian et al. "Differentially Private Sparse Linear Regression with Heavy-Tailed Responses." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025. doi:10.1007/978-3-032-06096-9_21
Markdown
[Tian et al. "Differentially Private Sparse Linear Regression with Heavy-Tailed Responses." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2025.](https://mlanthology.org/ecmlpkdd/2025/tian2025ecmlpkdd-differentially/) doi:10.1007/978-3-032-06096-9_21
BibTeX
@inproceedings{tian2025ecmlpkdd-differentially,
title = {{Differentially Private Sparse Linear Regression with Heavy-Tailed Responses}},
author = {Tian, Xizhi and Ding, Meng and Tao, Touming and Xiang, Zihang and Wang, Di},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2025},
pages = {363-379},
doi = {10.1007/978-3-032-06096-9_21},
url = {https://mlanthology.org/ecmlpkdd/2025/tian2025ecmlpkdd-differentially/}
}