Secure Dataset Condensation for Privacy-Preserving and Efficient Vertical Federated Learning

Abstract

This work addresses the dual challenges of enhancing training efficiency and protecting data privacy in Vertical Federated Learning (VFL) through secure synthetic dataset generation. VFL typically involves an active party with labels collaborating with a passive party possessing features of the same set of samples. Traditional VFL methods, however, rely on training with entire datasets of sensitive real data, leading to two primary issues: 1) reduced training efficiency due to large dataset sizes, a concern exacerbated in cryptography-based training methods; and 2) potential privacy leakage at the sample level during training. To mitigate these issues, we introduce the Vertical Federated Dataset Condensation (VFDC) method. VFDC employs a novel mixed protection mechanism, integrating class-wise secure aggregation, differential privacy and repetitive initialization, to securely match the distributions of real and synthetic data. Empirical evaluations on six real-world datasets validate VFDC’s efficacy in generating small synthetic data for VFL, achieving a superior utility-privacy-efficiency trade-off during federated training.
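As a rough illustration of what "matching the distributions of real and synthetic data" class by class might look like, the sketch below compares per-class feature means of a real and a synthetic dataset and perturbs the real means with Gaussian noise before comparison. This is not the authors' VFDC implementation. It omits the secure aggregation and repetitive initialization steps, works on raw features held in one place rather than on embeddings split across parties, and every function name and the value `sigma=0.1` are hypothetical choices made for the example. The noise scale is not calibrated to any formal privacy budget.

```python
import random
from collections import defaultdict


def class_means(rows, labels):
    """Return {label: mean feature vector} for the given rows."""
    sums, counts = {}, defaultdict(int)
    for row, y in zip(rows, labels):
        if y not in sums:
            sums[y] = [0.0] * len(row)
        sums[y] = [s + v for s, v in zip(sums[y], row)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def add_gaussian_noise(means, sigma, rng):
    """Perturb each class mean with zero-mean Gaussian noise.

    Assumption: sigma is an illustrative value, not derived from a
    differential-privacy sensitivity analysis.
    """
    return {y: [v + rng.gauss(0.0, sigma) for v in vec] for y, vec in means.items()}


def matching_loss(real_means, syn_means):
    """Sum over classes of the squared Euclidean distance between class means."""
    return sum(
        sum((r - s) ** 2 for r, s in zip(real_means[y], syn_means[y]))
        for y in real_means
    )


# Toy data: three real samples, two synthetic samples, two classes.
real_x = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
real_y = [0, 0, 1]
syn_x = [[2.0, 3.0], [9.0, 10.0]]
syn_y = [0, 1]

real_means = class_means(real_x, real_y)
syn_means = class_means(syn_x, syn_y)

# Loss on exact class means, then on noise-perturbed real means.
loss_clean = matching_loss(real_means, syn_means)
noisy_means = add_gaussian_noise(real_means, sigma=0.1, rng=random.Random(0))
loss_noisy = matching_loss(noisy_means, syn_means)
```

In a condensation loop, the synthetic samples would be updated to reduce such a loss. Here the two values are only computed once, to show how noise on the real-data statistics changes the signal the synthetic data is fitted to.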

Cite

Text

Gao et al. "Secure Dataset Condensation for Privacy-Preserving and Efficient Vertical Federated Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024. doi:10.1007/978-3-031-70341-6_13

Markdown

[Gao et al. "Secure Dataset Condensation for Privacy-Preserving and Efficient Vertical Federated Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2024.](https://mlanthology.org/ecmlpkdd/2024/gao2024ecmlpkdd-secure/) doi:10.1007/978-3-031-70341-6_13

BibTeX

@inproceedings{gao2024ecmlpkdd-secure,
  title     = {{Secure Dataset Condensation for Privacy-Preserving and Efficient Vertical Federated Learning}},
  author    = {Gao, Dashan and Wu, Canhui and Zhang, Xiaojin and Yao, Xin and Yang, Qiang},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2024},
  pages     = {212--229},
  doi       = {10.1007/978-3-031-70341-6_13},
  url       = {https://mlanthology.org/ecmlpkdd/2024/gao2024ecmlpkdd-secure/}
}