AdaDARE-Gamma: Balancing Stability and Plasticity in Multi-Modal LLMs Through Efficient Adaptation

Abstract

Adapting Multi-modal Large Language Models (MLLMs) to target tasks often suffers from catastrophic forgetting, where acquiring new task-specific knowledge compromises performance on pre-trained tasks. In this paper, we introduce AdaDARE-γ, an efficient approach that alleviates catastrophic forgetting by controllably injecting new task-specific knowledge through adaptive parameter selection from fine-tuned models, without requiring any retraining. The approach consists of two key innovations: (1) an adaptive parameter selection mechanism that identifies and retains the most task-relevant parameters from fine-tuned models, and (2) a controlled task-specific information injection strategy that precisely balances the preservation of pre-trained knowledge with the acquisition of new capabilities. Theoretical analysis proves the optimality of our parameter selection strategy and establishes bounds on the task-specific information injection factor. Extensive experiments on InstructBLIP and LLaVA-1.5 across image captioning and visual question answering tasks demonstrate that AdaDARE-γ establishes new state-of-the-art results in balancing model performance. Specifically, it maintains 98.2% of pre-training effectiveness on original tasks while achieving 98.7% of standard fine-tuning performance on target tasks.
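To make the general idea concrete, below is a minimal Python/NumPy sketch of a DARE-style merge with an injection factor γ: task-specific weight deltas are computed against the pre-trained model, a subset of them is retained, and the retained deltas are injected scaled by γ. The magnitude-based top-k selection, the `keep_ratio` parameter, and the function name `adadare_gamma_merge` are illustrative assumptions; the abstract does not specify the paper's actual adaptive selection criterion or the derived bounds on γ.

```python
import numpy as np

def adadare_gamma_merge(pretrained, finetuned, keep_ratio=0.1, gamma=0.5):
    """Sketch of a DARE-style merge with a controlled injection factor.

    pretrained / finetuned: dicts mapping parameter names to np.ndarray weights.
    keep_ratio: fraction of delta entries retained per tensor (assumed criterion:
        largest magnitude; the paper's adaptive selection may differ).
    gamma: injection factor scaling the retained task-specific deltas.
    """
    merged = {}
    for name, w_pre in pretrained.items():
        delta = finetuned[name] - w_pre           # task-specific update
        flat = np.abs(delta).ravel()
        k = max(1, int(keep_ratio * flat.size))   # number of entries to keep
        thresh = np.partition(flat, -k)[-k]       # magnitude of the k-th largest delta
        mask = np.abs(delta) >= thresh            # retain only the largest deltas
        merged[name] = w_pre + gamma * mask * delta
    return merged

# Toy usage: merge a single projection matrix from a hypothetical fine-tuned model.
pre = {"proj.weight": np.random.randn(4, 4)}
ft = {"proj.weight": pre["proj.weight"] + 0.01 * np.random.randn(4, 4)}
new_weights = adadare_gamma_merge(pre, ft, keep_ratio=0.25, gamma=0.8)
```

In this reading, γ trades off stability (γ → 0 keeps the pre-trained weights) against plasticity (γ → 1 fully injects the retained fine-tuned deltas); the abstract's theoretical bounds presumably constrain this factor.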

Cite

Text

Xie et al. "AdaDARE-Gamma: Balancing Stability and Plasticity in Multi-Modal LLMs Through Efficient Adaptation." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01840

Markdown

[Xie et al. "AdaDARE-Gamma: Balancing Stability and Plasticity in Multi-Modal LLMs Through Efficient Adaptation." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/xie2025cvpr-adadaregamma/) doi:10.1109/CVPR52734.2025.01840

BibTeX

@inproceedings{xie2025cvpr-adadaregamma,
  title     = {{AdaDARE-Gamma: Balancing Stability and Plasticity in Multi-Modal LLMs Through Efficient Adaptation}},
  author    = {Xie, Jingyi and Yang, Jintao and Luo, Zhunchen and Cao, Yunbo and Gao, Qiang and Zhang, Mengyuan and Hu, Wenpeng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {19758--19768},
  doi       = {10.1109/CVPR52734.2025.01840},
  url       = {https://mlanthology.org/cvpr/2025/xie2025cvpr-adadaregamma/}
}