On Traceability in $\ell_p$ Stochastic Convex Optimization

Abstract

In this paper, we investigate the necessity of traceability for accurate learning in stochastic convex optimization (SCO) under $\ell_p$ geometries. Informally, we say a learning algorithm is \emph{$m$-traceable} if, by analyzing its output, it is possible to identify at least $m$ of its training samples. Our main results uncover a fundamental tradeoff between traceability and excess risk in SCO. For every $p\in [1,\infty)$, we establish the existence of an excess risk threshold below which every sample-efficient learner is traceable with a number of samples that is a \emph{constant fraction} of its training set. For $p\in [1,2]$, this threshold coincides with the best excess risk achievable by differentially private (DP) algorithms; above it, there exist algorithms that are not traceable, so the threshold marks a sharp phase transition. For $p \in (2,\infty)$, the threshold instead yields novel lower bounds for DP learning, partially resolving an open problem in this setting. En route to establishing these results, we prove a sparse variant of the fingerprinting lemma, which may be of independent interest.
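To make the informal notion concrete, here is a minimal sketch of one way to formalize $m$-traceability; the tracer $T$, the intersection-based success criterion, and the constant $2/3$ are illustrative assumptions of this sketch, not the paper's formal definition.

\[
\exists\, T \ \text{such that}\ \Pr_{S,\,A}\Big[\,\big|\,T\big(A(S)\big) \cap \{z_1,\dots,z_n\}\,\big| \ge m\,\Big] \ge \tfrac{2}{3},
\]

where $A$ is the learning algorithm and $S = (z_1,\dots,z_n)$ its training sample. Read this way, the abstract's main claim is that below the excess risk threshold, every sample-efficient learner admits such a tracer with $m = \Omega(n)$, i.e., a constant fraction of the training set.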

Cite

Text

Voitovych et al. "On Traceability in $\ell_p$ Stochastic Convex Optimization." Advances in Neural Information Processing Systems, 2025.

Markdown

[Voitovych et al. "On Traceability in $\ell_p$ Stochastic Convex Optimization." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/voitovych2025neurips-traceability/)

BibTeX

@inproceedings{voitovych2025neurips-traceability,
  title     = {{On Traceability in $\ell_p$ Stochastic Convex Optimization}},
  author    = {Voitovych, Sasha and Haghifam, Mahdi and Attias, Idan and Dziugaite, Gintare Karolina and Livni, Roi and Roy, Daniel M.},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/voitovych2025neurips-traceability/}
}