Efficient Heterogeneity-Aware Federated Active Data Selection

Abstract

Federated Active Learning (FAL) aims to learn an effective global model while minimizing label queries. Privacy requirements make it challenging to design effective active data selection schemes, because cross-client query information is unavailable. In this paper, we bridge this important gap by proposing the Federated Active data selection by LEverage score sampling (FALE) method. It is designed for regression tasks in the presence of non-i.i.d. client data and enables the server to select data globally in a privacy-preserving manner. Building on FedSVD, FALE estimates the utility of unlabeled data and performs data selection via leverage score sampling. In addition, a secure model learning framework is designed for federated regression tasks to exploit supervision. FALE can operate without an initial labeled set and selects the instances in a single pass, significantly reducing communication overhead. Theoretical analysis establishes the query complexity for FALE to achieve constant-factor approximation and relative-error approximation. Extensive experiments on 11 benchmark datasets demonstrate significant improvements of FALE over existing state-of-the-art methods.
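
To make the selection idea concrete, the following is a minimal sketch of generic leverage score sampling for linear regression. It is not the FALE protocol: it assumes the unlabeled features of all clients are stacked in one matrix and uses an ordinary centralized SVD, whereas the abstract describes computing this in a privacy-preserving way on top of FedSVD. The function names `leverage_scores` and `leverage_sample`, the two simulated clients, and the budget of 40 queries are all hypothetical choices for illustration.

```python
import numpy as np


def leverage_scores(X):
    """Row leverage scores of X, computed from a thin SVD."""
    # X = U @ diag(s) @ Vt; the leverage score of row i is ||U[i]||^2.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Drop directions whose singular values are numerically zero.
    tol = s.max() * max(X.shape) * np.finfo(X.dtype).eps
    U = U[:, s > tol]
    return np.sum(U**2, axis=1)


def leverage_sample(X, budget, rng):
    """Draw `budget` row indices with probability proportional to leverage."""
    scores = leverage_scores(X)
    probs = scores / scores.sum()
    idx = rng.choice(X.shape[0], size=budget, replace=True, p=probs)
    # Rescaling weights that keep the sampled least-squares problem unbiased.
    weights = 1.0 / np.sqrt(budget * probs[idx])
    return idx, weights


rng = np.random.default_rng(0)

# Two simulated clients with different feature distributions (non-i.i.d.).
# ASSUMPTION: stacked centrally here; a federated method would not pool raw data.
X_a = rng.normal(loc=0.0, scale=1.0, size=(150, 5))
X_b = rng.normal(loc=2.0, scale=5.0, size=(50, 5))
X = np.vstack([X_a, X_b])

scores = leverage_scores(X)
idx, weights = leverage_sample(X, budget=40, rng=rng)

# Simulated labels; only the selected rows would actually be queried.
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=X.shape[0])

# Weighted least squares on the selected rows only.
w_hat, *_ = np.linalg.lstsq(
    X[idx] * weights[:, None], y[idx] * weights, rcond=None
)
```

Selection here depends only on the features, so no initial labeled set is needed and all queries are chosen in one pass, which mirrors the properties the abstract claims for FALE. Sampling is done with replacement, so the same row may be drawn more than once.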

Cite

Text

Tang et al. "Efficient Heterogeneity-Aware Federated Active Data Selection." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Tang et al. "Efficient Heterogeneity-Aware Federated Active Data Selection." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/tang2025icml-efficient/)

BibTeX

@inproceedings{tang2025icml-efficient,
  title     = {{Efficient Heterogeneity-Aware Federated Active Data Selection}},
  author    = {Tang, Ying-Peng and Ren, Chao and Tang, Xiaoli and Huang, Sheng-Jun and Cui, Lizhen and Yu, Han},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {58931-58943},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/tang2025icml-efficient/}
}