Model Fusion via Optimal Transport

Abstract

Combining different models is a widely used paradigm in machine learning applications. The most common approach is to form an ensemble of models and average their individual predictions, but this is often infeasible under resource constraints, because the memory and computation costs grow linearly with the number of models. We present a layer-wise model fusion algorithm for neural networks that uses optimal transport to (softly) align neurons across the models before averaging their associated parameters.
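The layer-wise idea in the abstract (align neurons first, then average parameters) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes two bias-free networks of equal width, uniform mass on every neuron, a squared Euclidean cost between incoming weight vectors, and a basic Sinkhorn solver for the entropy-regularised transport problem. The names `sinkhorn` and `fuse`, the `reg` value, and the toy weights are all hypothetical choices made for illustration.

```python
import numpy as np

def sinkhorn(cost, reg=0.01, n_iters=200):
    """Entropy-regularised OT plan between two uniform histograms (assumed marginals)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)            # mass on each neuron of model B
    b = np.full(m, 1.0 / m)            # mass on each neuron of model A
    scale = max(cost.max(), 1e-12)     # normalise the cost so `reg` is scale-free
    K = np.exp(-cost / (reg * scale))
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]  # plan[i, j]: mass moved from B's neuron i to A's neuron j

def fuse(model_a, model_b):
    """Fuse two lists of weight matrices, each of shape (fan_out, fan_in), into model A's neuron order."""
    fused = []
    align = np.eye(model_b[0].shape[1])   # input features are already aligned
    last = len(model_a) - 1
    for idx, (w_a, w_b) in enumerate(zip(model_a, model_b)):
        # Re-express B's incoming weights in terms of A's ordering of the previous layer.
        w_b = w_b @ align
        if idx == last:
            w_b_aligned = w_b             # output units have a fixed meaning, so no alignment
        else:
            # Pairwise squared distances between incoming weight vectors (B rows vs. A rows).
            cost = ((w_b[:, None, :] - w_a[None, :, :]) ** 2).sum(-1)
            plan = sinkhorn(cost)
            align = plan * plan.shape[1]  # rescale so each column sums to 1
            w_b_aligned = align.T @ w_b   # soft-permute B's neurons onto A's
        fused.append(0.5 * (w_a + w_b_aligned))
    return fused

# Toy check: model B is model A with its hidden neurons shuffled.
perm = [2, 0, 1]
model_a = [np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]),   # hidden layer: 3 neurons, 2 inputs
           np.array([[0.5, -1.0, 2.0]])]                        # output layer: 1 unit
model_b = [model_a[0][perm], model_a[1][:, perm]]

fused = fuse(model_a, model_b)
naive = [0.5 * (wa + wb) for wa, wb in zip(model_a, model_b)]
```

In this toy setup both models compute the same function, so aligned averaging should give back model A's weights up to numerical error, whereas the naive average in `naive` mixes unrelated neurons. With a larger `reg` the transport plan becomes softer, and each fused neuron blends several of model B's neurons.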

Cite

Text

Singh and Jaggi. "Model Fusion via Optimal Transport." Neural Information Processing Systems, 2020.

Markdown

[Singh and Jaggi. "Model Fusion via Optimal Transport." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/singh2020neurips-model/)

BibTeX

@inproceedings{singh2020neurips-model,
  title     = {{Model Fusion via Optimal Transport}},
  author    = {Singh, Sidak Pal and Jaggi, Martin},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/singh2020neurips-model/}
}