Improving Model Merging with Natural Niches

Abstract

Model merging is a powerful technique for combining the specialized knowledge of multiple machine learning models into a single unified model. However, current methods require manually partitioning the model parameters into a fixed number of groups to be merged, which constrains the exploration of potential combinations and limits performance. To address these limitations, we propose an evolutionary algorithm with three key features: (1) dynamic adjustment of merging boundaries to progressively explore a broader range of parameter combinations; (2) a diversity preservation mechanism inspired by nature, which maintains a population of diverse, high-performing models that are particularly effective for merging; and (3) a heuristic-based mate selection strategy to identify the most promising pairs of models for merging. Our experimental results show, for the first time, that model merging can be used to evolve models from scratch. Specifically, we evolve MNIST classifiers from scratch using our method and achieve performance comparable to CMA-ES while being computationally cheaper. Additionally, we use our method to merge specialized language models and obtain state-of-the-art performance. Our code is available at https://github.com/AnonScientist/natural_niches.
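
To make the three ingredients concrete, here is a minimal, self-contained sketch of an evolutionary merging loop in the spirit the abstract describes. It is not the authors' implementation: the toy fitness function, the per-parameter random merge mask (standing in for dynamic merging boundaries), the fitness-plus-dissimilarity mate selection heuristic, and the replace-the-worst niching rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(params):
    # Toy stand-in for task performance: closeness to a hidden target.
    target = np.linspace(-1.0, 1.0, params.size)
    return -np.mean((params - target) ** 2)

def merge(parent_a, parent_b):
    # Dynamic merging boundaries (assumed form): a fresh random binary
    # mask per merge decides, parameter by parameter, which parent
    # contributes, rather than using fixed hand-picked groups.
    mask = rng.random(parent_a.size) < 0.5
    return np.where(mask, parent_a, parent_b)

def select_mates(population, scores):
    # Heuristic mate selection (assumed form): pick the first parent
    # proportionally to fitness, then the partner most dissimilar to it.
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    i = rng.choice(len(population), p=probs)
    dists = np.array([np.linalg.norm(population[i] - q) for q in population])
    j = int(np.argmax(dists))
    return population[i], population[j]

population = [rng.normal(size=64) for _ in range(16)]
for step in range(200):
    scores = np.array([fitness(p) for p in population])
    child = merge(*select_mates(population, scores))
    child += rng.normal(scale=0.01, size=child.size)  # small mutation
    # Diversity preservation: only the worst individual is ever replaced,
    # and only by a child that beats it, so distinct niches persist.
    worst = int(np.argmin(scores))
    if fitness(child) > scores[worst]:
        population[worst] = child

print("best fitness:", max(fitness(p) for p in population))
```

In the paper, the individuals would be full model parameter vectors (e.g. MNIST classifiers or language models) and fitness would be task accuracy; the sketch only shows how the three mechanisms fit together in one loop.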

Cite

Text

Abrantes et al. "Improving Model Merging with Natural Niches." NeurIPS 2024 Workshops: UniReps, 2024.

Markdown

[Abrantes et al. "Improving Model Merging with Natural Niches." NeurIPS 2024 Workshops: UniReps, 2024.](https://mlanthology.org/neuripsw/2024/abrantes2024neuripsw-improving-a/)

BibTeX

@inproceedings{abrantes2024neuripsw-improving-a,
  title     = {{Improving Model Merging with Natural Niches}},
  author    = {Abrantes, João and Lange, Robert Tjarko and Tang, Yujin},
  booktitle = {NeurIPS 2024 Workshops: UniReps},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/abrantes2024neuripsw-improving-a/}
}