Augmented Mixup Procedure for Privacy-Preserving Collaborative Training

Abstract

Mixup involves training neural networks on convex combinations of input samples and labels, and it has been adapted for privacy-preserving collaborative training, most notably in InstaHide. However, mixing-based obfuscation schemes create structured linear systems that can be exploited to reconstruct the underlying private data. We propose a singularized mixup procedure that injects controlled perturbations prior to forming convex combinations, rendering the resulting inverse problem ill-conditioned while preserving discriminative structure. We provide an average-case theoretical analysis that characterizes the security-utility trade-off via minimax reconstruction bounds and directional signal-to-noise ratio control. Empirically, we evaluate classification accuracy on MNIST, CIFAR-10, CIFAR-100, and Tiny-ImageNet, and compare against InstaHide, observing competitive or improved accuracy under strong privacy settings. We assess robustness against both linear and nonlinear reconstruction attacks, including at-scale linear inversion experiments on CIFAR-5M. In a collaborative training setting with multiple parties and heterogeneous data partitions, we further compare against standard federated learning (FedProx), showing that singularized mixup enables accurate centralized training without iterative gradient exchange and yields improved robustness and performance in heterogeneous regimes. Overall, our results demonstrate that singularized mixup substantially degrades reconstruction quality while maintaining strong predictive performance, providing a practical and scalable approach to privacy-preserving collaborative learning.
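
To make the two operations named in the abstract concrete, the following is a minimal NumPy sketch of standard mixup (a convex combination of two inputs and their labels) next to a variant that perturbs the inputs before mixing. It is an illustration only, not the paper's procedure: the abstract does not specify the form of the "controlled perturbations", so the isotropic Gaussian noise, the `noise_scale` parameter, and the function names `mixup_pair` and `perturbed_mixup_pair` are all assumptions.

```python
import numpy as np


def mixup_pair(x1, y1, x2, y2, lam):
    """Standard mixup: convex combination of two inputs and their one-hot labels."""
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y


def perturbed_mixup_pair(x1, y1, x2, y2, lam, noise_scale, rng):
    """Hypothetical variant: perturb each input, then form the convex combination.

    ASSUMPTION: isotropic Gaussian noise stands in for the paper's
    "controlled perturbations", whose actual form is not given in the abstract.
    """
    x1_noisy = x1 + noise_scale * rng.standard_normal(x1.shape)
    x2_noisy = x2 + noise_scale * rng.standard_normal(x2.shape)
    return mixup_pair(x1_noisy, y1, x2_noisy, y2, lam)


rng = np.random.default_rng(0)

# Two toy "images" flattened to 8 features, with one-hot labels for 2 classes.
x1, x2 = rng.random((2, 8))
y1 = np.array([1.0, 0.0])
y2 = np.array([0.0, 1.0])
lam = 0.7

x_mix, y_mix = mixup_pair(x1, y1, x2, y2, lam)
x_priv, y_priv = perturbed_mixup_pair(x1, y1, x2, y2, lam, noise_scale=0.1, rng=rng)
```

In this sketch the labels are mixed identically in both cases; only the inputs differ, which is where a reconstruction attack on the mixed samples would operate.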

Cite

Text

Pleșa et al. "Augmented Mixup Procedure for Privacy-Preserving Collaborative Training." Transactions on Machine Learning Research, 2026.

Markdown

[Pleșa et al. "Augmented Mixup Procedure for Privacy-Preserving Collaborative Training." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/plesa2026tmlr-augmented/)

BibTeX

@article{plesa2026tmlr-augmented,
  title     = {{Augmented Mixup Procedure for Privacy-Preserving Collaborative Training}},
  author    = {Pleșa, Mihail-Iulian and Clérot, Fabrice and David, Simona Elena and Poenaru, Robert},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/plesa2026tmlr-augmented/}
}