FDAPT: Federated Domain-Adaptive Pre-Training for Language Models

Abstract

Foundation models (FMs) have shown prominent success in a wide range of tasks [Bommasani et al., 2021]. Their applicability to specific domain-task pairings relies on the availability of both high-quality data and significant computational resources. These challenges are not new to the field, and Federated Learning (FL) has been shown to be a promising solution in similar setups [Yu et al., 2023, Zhuang et al., 2023]. This paper tackles the specific case of Domain-adaptive Pre-training (DAPT), a key step in the application of FMs. We conduct the first comprehensive empirical study to evaluate the performance of Federated Domain-adaptive Pre-training (FDAPT). We demonstrate that FDAPT maintains downstream task performance competitive with the centralized baseline in both IID and non-IID settings. Finally, we propose a novel algorithm, Frozen Federated Domain-adaptive Pre-training (FFDAPT). FFDAPT improves computational efficiency by 12.1% on average and exhibits downstream task performance similar to vanilla FDAPT, with general performance fluctuations remaining below 1%.
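
To make the setup concrete, below is a minimal sketch of one federated DAPT round: each client continues masked-language-model pre-training on its local domain text, and the server aggregates the updates with FedAvg. A `freeze_embeddings` flag illustrates a "frozen" variant in the spirit of FFDAPT. The toy model, the choice of frozen parameters (the embedding layer), and the helper names (`TinyMLM`, `local_update`, `fedavg`) are illustrative assumptions for this sketch, not the paper's exact implementation.

```python
# Minimal FedAvg-style sketch of federated domain-adaptive pre-training.
# Assumptions: a toy transformer MLM stands in for the language model, and
# the "frozen" variant is modeled by freezing the embedding layer locally.
import copy
import torch
import torch.nn as nn

class TinyMLM(nn.Module):
    """Toy stand-in for a transformer encoder with a masked-LM head."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.mlm_head = nn.Linear(dim, vocab_size)

    def forward(self, input_ids):
        return self.mlm_head(self.encoder(self.embed(input_ids)))

def local_update(global_model, batches, freeze_embeddings=False, lr=5e-4):
    """Client-side DAPT: train a copy of the global model on local domain text."""
    model = copy.deepcopy(global_model)
    if freeze_embeddings:  # "frozen" variant; layer choice is an assumption
        for p in model.embed.parameters():
            p.requires_grad_(False)
    opt = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=lr)
    loss_fn = nn.CrossEntropyLoss(ignore_index=-100)
    for input_ids, labels in batches:  # labels are -100 except at masked positions
        logits = model(input_ids)
        loss = loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.state_dict()

def fedavg(client_states, client_sizes):
    """Server-side aggregation: data-size-weighted average of client parameters."""
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(s[key].float() * (n / total)
                       for s, n in zip(client_states, client_sizes))
    return avg

# One communication round over K clients (toy usage):
# global_model = TinyMLM()
# states = [local_update(global_model, client_batches[k], freeze_embeddings=True)
#           for k in range(K)]
# global_model.load_state_dict(fedavg(states, [len(b) for b in client_batches]))
```

Freezing part of the model reduces local gradient computation and optimizer state, which is one plausible way to read the efficiency gain reported for FFDAPT; the exact mechanism is described in the paper itself.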

Cite

Text

Jiang et al. "FDAPT: Federated Domain-Adaptive Pre-Training for Language Models." NeurIPS 2023 Workshops: Federated_Learning, 2023.

Markdown

[Jiang et al. "FDAPT: Federated Domain-Adaptive Pre-Training for Language Models." NeurIPS 2023 Workshops: Federated_Learning, 2023.](https://mlanthology.org/neuripsw/2023/jiang2023neuripsw-fdapt/)

BibTeX

@inproceedings{jiang2023neuripsw-fdapt,
  title     = {{FDAPT: Federated Domain-Adaptive Pre-Training for Language Models}},
  author    = {Jiang, Lekang and Svoboda, Filip and Lane, Nicholas Donald},
  booktitle = {NeurIPS 2023 Workshops: Federated_Learning},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/jiang2023neuripsw-fdapt/}
}