Little Is Enough: Boosting Privacy by Sharing Only Hard Labels in Federated Semi-Supervised Learning

Abstract

In many critical applications, sensitive data is inherently distributed and cannot be centralized due to privacy concerns. A wide range of federated learning approaches have been proposed to train models locally at each client without sharing their sensitive data, typically by exchanging model parameters, probabilistic predictions (soft labels) on a public dataset, or a combination of both. However, these methods still disclose private information and restrict local models to those that can be trained using gradient-based methods. We propose a federated co-training (FEDCT) approach that improves privacy by sharing only definitive (hard) labels on a public unlabeled dataset. Clients use a consensus of these shared labels as pseudo-labels for local training. This federated co-training approach empirically enhances privacy without compromising model quality. In addition, it allows the use of local models that are not suitable for parameter aggregation in traditional federated learning, such as gradient-boosted decision trees, rule ensembles, and random forests. Furthermore, we observe that FEDCT performs effectively in federated fine-tuning of large language models, where its pseudo-labeling mechanism is particularly beneficial. Empirical evaluations and theoretical analyses suggest its applicability across a range of federated learning scenarios.
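
The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes the consensus is a per-example majority vote over integer class labels, with ties broken toward the lowest class id; the abstract does not state the exact rule, and the function name `majority_vote` and the array layout are hypothetical choices made for illustration.

```python
import numpy as np


def majority_vote(hard_labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Combine clients' hard labels on a shared public dataset.

    hard_labels: integer array of shape (num_clients, num_public), where
        hard_labels[c, i] is client c's predicted class for public example i.
    Returns an array of shape (num_public,) holding the most frequent label
    per example. Majority voting is an assumption of this sketch; ties go to
    the lowest class id because argmax returns the first maximum.
    """
    num_clients, num_public = hard_labels.shape
    counts = np.zeros((num_public, num_classes), dtype=np.int64)
    rows = np.arange(num_public)
    for client_labels in hard_labels:
        # Each public example appears once per client, so the indices are unique.
        counts[rows, client_labels] += 1
    return counts.argmax(axis=1)


# Three hypothetical clients label four public examples (classes 0..2).
shared = np.array([
    [0, 1, 2, 1],
    [0, 1, 1, 1],
    [2, 1, 2, 0],
])
pseudo_labels = majority_vote(shared, num_classes=3)
print(pseudo_labels)
```

In this sketch each client would then add the public examples, paired with `pseudo_labels`, to its local training set. Because only class ids are exchanged, the local learner can be any classifier, including the non-gradient-based models the abstract mentions.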

Cite

Text

Abourayya et al. "Little Is Enough: Boosting Privacy by Sharing Only Hard Labels in Federated Semi-Supervised Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I15.33678

Markdown

[Abourayya et al. "Little Is Enough: Boosting Privacy by Sharing Only Hard Labels in Federated Semi-Supervised Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/abourayya2025aaai-little/) doi:10.1609/AAAI.V39I15.33678

BibTeX

@inproceedings{abourayya2025aaai-little,
  title     = {{Little Is Enough: Boosting Privacy by Sharing Only Hard Labels in Federated Semi-Supervised Learning}},
  author    = {Abourayya, Amr and Kleesiek, Jens and Rao, Kanishka and Ayday, Erman and Rao, Bharat and Webb, Geoffrey I. and Kamp, Michael},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {15293--15301},
  doi       = {10.1609/AAAI.V39I15.33678},
  url       = {https://mlanthology.org/aaai/2025/abourayya2025aaai-little/}
}