$\texttt{Model-GLUE}$: Democratized LLM Scaling for a Large Model Zoo in the Wild

Abstract

As Large Language Models (LLMs) excel across tasks and specialized domains, scaling LLMs by building on existing models has gained significant attention, though it is challenged by potential performance drops when disparate models are combined. Various techniques have been proposed to aggregate pre-trained LLMs, including model merging, Mixture-of-Experts, and stacking. Despite their merits, a comprehensive comparison of these techniques and their synergistic application to a diverse model zoo has yet to be adequately addressed. In light of this research gap, this paper introduces $\texttt{Model-GLUE}$, a holistic LLM scaling guideline. First, we benchmark existing LLM scaling techniques, especially selective merging and variants of mixture. Utilizing insights from the benchmark results, we formulate a strategy for selecting and aggregating a heterogeneous model zoo characterized by different architectures and initializations. Our methodology involves clustering mergeable models, selecting a merging strategy, and integrating model clusters through model-level mixture. Finally, in experiments on a diverse Llama-2-based model zoo, $\texttt{Model-GLUE}$ shows an average performance improvement of 5.61\%, achieved without additional training. Code is available at https://github.com/Model-GLUE/Model-GLUE.
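For readers unfamiliar with the merging primitive the abstract refers to, the sketch below shows the simplest form of weight-space merging: uniform parameter averaging of two checkpoints that share the Llama-2 architecture and initialization (i.e., models from the same "mergeable" cluster). This is a minimal illustration only, not the authors' $\texttt{Model-GLUE}$ pipeline, and the checkpoint names are hypothetical placeholders.

```python
# Minimal sketch of weight-space merging via uniform parameter averaging.
# Assumes both checkpoints share identical architecture and parameter names.
import torch
from transformers import AutoModelForCausalLM


def average_merge(model_name_a: str, model_name_b: str, alpha: float = 0.5):
    """Linearly interpolate the parameters of two same-architecture models."""
    model_a = AutoModelForCausalLM.from_pretrained(model_name_a, torch_dtype=torch.float16)
    model_b = AutoModelForCausalLM.from_pretrained(model_name_b, torch_dtype=torch.float16)

    state_b = model_b.state_dict()
    merged_state = {}
    for name, param_a in model_a.state_dict().items():
        # Interpolate each tensor; shapes must match across the two models.
        merged_state[name] = alpha * param_a + (1.0 - alpha) * state_b[name]

    model_a.load_state_dict(merged_state)
    return model_a


if __name__ == "__main__":
    # Hypothetical checkpoint identifiers; substitute any two Llama-2-based fine-tunes.
    merged = average_merge("org/llama2-code-expert", "org/llama2-math-expert", alpha=0.5)
    merged.save_pretrained("./merged-llama2")
```

In practice, selective merging goes beyond this uniform average by choosing which models to merge and how to weight them, and clusters that cannot be merged are instead combined through model-level mixture, as described in the abstract.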

Cite

Text

Zhao et al. "$\texttt{Model-GLUE}$: Democratized LLM Scaling for a Large Model Zoo in the Wild." Neural Information Processing Systems, 2024. doi:10.52202/079017-0426

Markdown

[Zhao et al. "$\texttt{Model-GLUE}$: Democratized LLM Scaling for a Large Model Zoo in the Wild." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/zhao2024neurips-modelglue/) doi:10.52202/079017-0426

BibTeX

@inproceedings{zhao2024neurips-modelglue,
  title     = {{$\texttt{Model-GLUE}$: Democratized LLM Scaling for a Large Model Zoo in the Wild}},
  author    = {Zhao, Xinyu and Sun, Guoheng and Cai, Ruisi and Zhou, Yukun and Li, Pingzhi and Wang, Peihao and Tan, Bowen and He, Yexiao and Chen, Li and Liang, Yi and Chen, Beidi and Yuan, Binhang and Wang, Hongyi and Li, Ang and Wang, Zhangyang and Chen, Tianlong},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-0426},
  url       = {https://mlanthology.org/neurips/2024/zhao2024neurips-modelglue/}
}