Understanding Permutation Based Model Merging with Feature Visualizations

Abstract

Linear mode connectivity (LMC) has become a topic of great interest in recent years. It has been empirically demonstrated that popular deep learning models trained from different initializations exhibit linear mode connectivity up to permutation. Based on this, several approaches for finding a permutation of a model's features or weights have been proposed, leading to several popular methods for model merging. These methods enable the simple averaging of two models to create a new high-performance model. However, beyond accuracy, the properties of these merged models and their relationships to the representations of the models they derive from are poorly understood. In this work, we study the inner mechanisms behind LMC in model merging through the lens of classic feature visualization methods. Focusing on convolutional neural networks (CNNs), we make several observations that shed light on the underlying mechanisms of model merging by permute and average.
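The "permute and average" procedure the abstract refers to can be illustrated with a minimal sketch. The code below is not the paper's method: it uses a toy one-hidden-layer MLP and a simple greedy weight-similarity matching as a stand-in for the more sophisticated alignment algorithms used in practice (e.g., Hungarian-assignment weight or activation matching). All function names and the matching heuristic are illustrative assumptions; the key invariant it demonstrates is that permuting hidden units leaves a network's function unchanged, which is what makes averaging after alignment meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mlp(rng, d_in=4, d_h=6, d_out=3):
    # Parameters of a toy one-hidden-layer MLP (illustrative, not the paper's CNNs).
    return {
        "W1": rng.normal(size=(d_h, d_in)),
        "b1": rng.normal(size=d_h),
        "W2": rng.normal(size=(d_out, d_h)),
    }

def forward(p, x):
    h = np.maximum(p["W1"] @ x + p["b1"], 0.0)  # ReLU hidden layer
    return p["W2"] @ h

def permute_hidden(p, perm):
    # Re-order hidden units: permute rows of W1/b1 and columns of W2.
    # This yields a functionally identical network.
    return {"W1": p["W1"][perm], "b1": p["b1"][perm], "W2": p["W2"][:, perm]}

def greedy_match(pa, pb):
    # Toy alignment: greedily pair B's hidden units with A's by
    # incoming-weight similarity (a stand-in for Hungarian matching).
    sim = pa["W1"] @ pb["W1"].T          # (d_h, d_h) similarity matrix
    d_h = sim.shape[0]
    perm = np.full(d_h, -1, dtype=int)
    used = set()
    for i in range(d_h):
        j = max((j for j in range(d_h) if j not in used), key=lambda j: sim[i, j])
        perm[i] = j
        used.add(j)
    return perm

def merge(pa, pb_aligned):
    # Simple weight averaging of the two aligned models.
    return {k: 0.5 * (pa[k] + pb_aligned[k]) for k in pa}

A, B = make_mlp(rng), make_mlp(rng)
B_aligned = permute_hidden(B, greedy_match(A, B))
merged = merge(A, B_aligned)

x = rng.normal(size=4)
# Permuting hidden units does not change B's input-output function.
assert np.allclose(forward(B, x), forward(B_aligned, x))
```

The assertion at the end checks the permutation-symmetry property that underlies LMC modulo permutation: aligning B to A changes its parameterization but not its function, so the averaged model interpolates between two networks placed in a common coordinate system.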

Cite

Text

Zou et al. "Understanding Permutation Based Model Merging with Feature Visualizations." NeurIPS 2024 Workshops: UniReps, 2024.

Markdown

[Zou et al. "Understanding Permutation Based Model Merging with Feature Visualizations." NeurIPS 2024 Workshops: UniReps, 2024.](https://mlanthology.org/neuripsw/2024/zou2024neuripsw-understanding/)

BibTeX

@inproceedings{zou2024neuripsw-understanding,
  title     = {{Understanding Permutation Based Model Merging with Feature Visualizations}},
  author    = {Zou, Congshu and Nanfack, Geraldin and Horoi, Stefan and Belilovsky, Eugene},
  booktitle = {NeurIPS 2024 Workshops: UniReps},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/zou2024neuripsw-understanding/}
}