Initialization Matters: Unraveling the Impact of Pre-Training on Federated Learning

Abstract

Initializing with pre-trained models when learning on downstream tasks is becoming standard practice in machine learning. Several recent works explore the benefits of pre-trained initialization in a federated learning (FL) setting, where the downstream training is performed at edge clients with heterogeneous data distributions. These works show that starting from a pre-trained model can substantially reduce the adverse impact of data heterogeneity on the test performance of a model trained in a federated setting, with no changes to the standard FedAvg training algorithm. In this work, we provide a deeper theoretical understanding of this phenomenon. To do so, we study the class of two-layer convolutional neural networks (CNNs) and provide bounds on the training error convergence and test error of such a network trained with FedAvg. We introduce the notion of aligned and misaligned filters at initialization and show that data heterogeneity only affects learning on the misaligned filters. Starting with a pre-trained model typically results in fewer misaligned filters at initialization, thus producing a lower test error even when the model is trained in a federated setting with data heterogeneity. Experiments in synthetic settings and practical FL training on CNNs verify our theoretical findings.
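To make the setting concrete, the standard FedAvg loop the abstract refers to can be sketched as follows. This is a hypothetical minimal illustration, not the paper's setup: the paper analyzes two-layer CNNs, whereas this sketch uses a simple linear least-squares model, and the function names (`local_sgd`, `fedavg`) and hyperparameters are illustrative choices.

```python
# Minimal FedAvg sketch (illustrative; the paper studies two-layer CNNs,
# this example uses a linear model for brevity).
import numpy as np

def local_sgd(w, X, y, lr=0.02, steps=5):
    """Run a few local gradient-descent steps on one client's data."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def fedavg(w_init, clients, rounds=100):
    """Each round: clients train locally from the global model,
    then the server averages the results, weighted by data size."""
    w = w_init.copy()
    n_total = sum(len(y) for _, y in clients)
    for _ in range(rounds):
        w = sum(len(y) / n_total * local_sgd(w, X, y) for X, y in clients)
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
# Two clients with heterogeneous input distributions (different feature means)
clients = []
for shift in (0.0, 3.0):
    X = rng.normal(shift, 1.0, size=(50, 2))
    y = X @ w_true
    clients.append((X, y))

w = fedavg(np.zeros(2), clients)
```

Here both clients share the same underlying labeling function, so FedAvg recovers `w_true` despite the heterogeneous inputs; the abstract's point is that with genuinely heterogeneous objectives, initialization (e.g., from a pre-trained model) governs how much that heterogeneity hurts.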

Cite

Text

Jhunjhunwala et al. "Initialization Matters: Unraveling the Impact of Pre-Training on Federated Learning." Transactions on Machine Learning Research, 2025.

Markdown

[Jhunjhunwala et al. "Initialization Matters: Unraveling the Impact of Pre-Training on Federated Learning." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/jhunjhunwala2025tmlr-initialization/)

BibTeX

@article{jhunjhunwala2025tmlr-initialization,
  title     = {{Initialization Matters: Unraveling the Impact of Pre-Training on Federated Learning}},
  author    = {Jhunjhunwala, Divyansh and Sharma, Pranay and Xu, Zheng and Joshi, Gauri},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/jhunjhunwala2025tmlr-initialization/}
}