Model Stock: All We Need Is Just a Few Fine-Tuned Models

Abstract

This paper introduces an efficient fine-tuning method for large pre-trained models, offering strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from traditional practices that need a multitude of fine-tuned models for averaging, our approach employs significantly fewer models to achieve the final weights while yielding superior accuracy. Drawing from key insights in the weight space of fine-tuned weights, we uncover a strong link between performance and proximity to the center of the weight space. Based on this, we introduce a method that approximates a center-close weight using only two fine-tuned models, applicable during or after training. Our innovative layer-wise weight averaging technique surpasses state-of-the-art model averaging methods such as Model Soup while utilizing only two fine-tuned models. This strategy can be aptly coined Model Stock, highlighting its reliance on selecting a minimal number of models to derive a more optimized averaged model. We demonstrate the efficacy of Model Stock with fine-tuned models based upon pre-trained CLIP architectures, achieving remarkable performance on both ID and OOD tasks on standard benchmarks, all while adding barely any extra computational demands. Our code and pre-trained models are available at https://github.com/naver-ai/model-stock.
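To make the layer-wise averaging concrete, below is a minimal PyTorch sketch of a two-model Model Stock merge. It assumes state-dict-style weight dictionaries; the function name model_stock_merge is illustrative rather than the repository's API, and the per-layer ratio t = 2cosθ / (1 + cosθ) follows the geometric interpolation described in the paper.

import torch

def model_stock_merge(w0, w1, w2, eps=1e-8):
    # w0: pre-trained weights; w1, w2: two fine-tuned weights
    # (all dicts mapping parameter names to tensors of matching shape).
    merged = {}
    for name, pre in w0.items():
        if not torch.is_floating_point(pre):
            merged[name] = w1[name]  # copy non-float buffers as-is
            continue
        d1 = (w1[name] - pre).flatten().float()
        d2 = (w2[name] - pre).flatten().float()
        # Cosine of the angle between the two fine-tuned deviations
        # from the pre-trained weight, computed per layer.
        cos = torch.dot(d1, d2) / (d1.norm() * d2.norm() + eps)
        # Layer-wise ratio pulling the two-model average toward the
        # approximated center of the fine-tuned weight distribution.
        t = 2.0 * cos / (1.0 + cos)
        w_avg = (w1[name] + w2[name]) / 2.0
        merged[name] = (t * w_avg + (1.0 - t) * pre).to(pre.dtype)
    return merged

In use, w0, w1, and w2 would be state dicts (e.g. from torch.load) of the pre-trained CLIP model and two independent fine-tuning runs, and the merged dict can be loaded back with model.load_state_dict(merged).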

Cite

Text

Jang et al. "Model Stock: All We Need Is Just a Few Fine-Tuned Models." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72784-9_12

Markdown

[Jang et al. "Model Stock: All We Need Is Just a Few Fine-Tuned Models." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/jang2024eccv-model/) doi:10.1007/978-3-031-72784-9_12

BibTeX

@inproceedings{jang2024eccv-model,
  title     = {{Model Stock: All We Need Is Just a Few Fine-Tuned Models}},
  author    = {Jang, Dong-Hwan and Yun, Sangdoo and Han, Dongyoon},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72784-9_12},
  url       = {https://mlanthology.org/eccv/2024/jang2024eccv-model/}
}