Beyond Parameter Averaging in Model Aggregation

Abstract

The success of foundation models is strongly linked to scale, which has reinforced interest in federated learning. Yet, given the prohibitive cost of training a large language model (LLM), surprisingly little attention has been paid to reusing pre-trained models in collaborative training settings. Self-supervision has also played an important role in this success, but its emphasis has been placed primarily on data. This paper leverages Bayesian principles to bring self-supervision into the model aggregation toolbox. It introduces self-supervised Fisher merging, a framework that merges models in parameter space without revisiting data, opening a new avenue for model reuse. Experimental results establish the foundations of our method on tractable linear models and highlight its potential for aggregating neural networks.
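The abstract does not spell out the merging rule, but Fisher merging is commonly understood as a diagonal-Fisher-weighted average of parameters, i.e. each merged parameter is a weighted mean of the corresponding parameters across models, with the (diagonal) Fisher information acting as the weight. The PyTorch sketch below illustrates only that generic weighted average; the function name `fisher_merge` and the placeholder Fisher estimates are hypothetical, and the self-supervised, data-free estimation of the Fisher terms that the paper contributes is not reproduced here.

```python
import torch

def fisher_merge(state_dicts, fishers, eps=1e-8):
    """Merge models via a diagonal Fisher-weighted average in parameter space.

    state_dicts: list of parameter dicts {name: tensor}, one per model.
    fishers:     list of dicts {name: tensor} holding a diagonal Fisher
                 estimate for every parameter of the matching model.
    Returns a dict of merged parameters, computed elementwise as
        theta*[name] = sum_i F_i[name] * theta_i[name] / sum_i F_i[name],
    with eps guarding against division by zero.
    """
    merged = {}
    for name in state_dicts[0]:
        num = sum(f[name] * sd[name] for f, sd in zip(fishers, state_dicts))
        den = sum(f[name] for f in fishers) + eps
        merged[name] = num / den
    return merged


# Toy usage: merging two small linear models with placeholder Fisher estimates.
# With all-ones Fisher diagonals this reduces to plain parameter averaging;
# informative Fisher estimates are what take the merge "beyond" that baseline.
if __name__ == "__main__":
    model_a = torch.nn.Linear(4, 2)
    model_b = torch.nn.Linear(4, 2)
    fisher_a = {k: torch.ones_like(v) for k, v in model_a.state_dict().items()}
    fisher_b = {k: torch.ones_like(v) for k, v in model_b.state_dict().items()}
    merged = fisher_merge([model_a.state_dict(), model_b.state_dict()],
                          [fisher_a, fisher_b])
    model_a.load_state_dict(merged)  # reuse one module to host the merged weights
```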

Cite

Text

Recasens et al. "Beyond Parameter Averaging in Model Aggregation." NeurIPS 2023 Workshops: Federated_Learning, 2023.

Markdown

[Recasens et al. "Beyond Parameter Averaging in Model Aggregation." NeurIPS 2023 Workshops: Federated_Learning, 2023.](https://mlanthology.org/neuripsw/2023/recasens2023neuripsw-beyond/)

BibTeX

@inproceedings{recasens2023neuripsw-beyond,
  title     = {{Beyond Parameter Averaging in Model Aggregation}},
  author    = {Recasens, Pol G. and Torres, Jordi and Berral, Josep Lluis and Hauberg, Søren and Moreno-Muñoz, Pablo},
  booktitle = {NeurIPS 2023 Workshops: Federated_Learning},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/recasens2023neuripsw-beyond/}
}