Triplets Oversampling for Class Imbalanced Federated Datasets
Abstract
Class imbalance is a pervasive problem in machine learning and leads to poor performance on the under-represented minority class. Federated learning, which trains a shared model collaboratively among multiple clients while keeping their data local for privacy protection, is also susceptible to class imbalance. Its distributed structure and privacy rules compound the challenge, because each client's dataset is isolated, small, and highly skewed. Sampling and ensemble learning are state-of-the-art techniques for mitigating class imbalance from the data and algorithm perspectives, respectively, but they face limitations in federated learning. To address this challenge, we propose a novel oversampling algorithm called "Triplets" that generates synthetic samples for both minority and majority classes based on their shared classification boundary. The algorithm generates new minority samples from triplets of samples around the boundary, in which two samples come from the majority class and one from the minority class. This approach offers several advantages over existing oversampling techniques on federated datasets. We evaluate the proposed algorithm through extensive experiments using various real-world datasets and different models in both centralized and federated learning environments. The results show that it outperforms existing oversampling techniques, offering a promising solution to the class imbalance problem in federated learning. The source code is released at github.com/Xiao-Chenguang/Triplets-Oversampling.
Cite
Text
Xiao and Wang. "Triplets Oversampling for Class Imbalanced Federated Datasets." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43415-0_22
Markdown
[Xiao and Wang. "Triplets Oversampling for Class Imbalanced Federated Datasets." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/xiao2023ecmlpkdd-triplets/) doi:10.1007/978-3-031-43415-0_22
BibTeX
@inproceedings{xiao2023ecmlpkdd-triplets,
title = {{Triplets Oversampling for Class Imbalanced Federated Datasets}},
author = {Xiao, Chenguang and Wang, Shuo},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2023},
pages = {368--383},
doi = {10.1007/978-3-031-43415-0_22},
url = {https://mlanthology.org/ecmlpkdd/2023/xiao2023ecmlpkdd-triplets/}
}