LoBAM: LoRA-Based Backdoor Attack on Model Merging
Abstract
Model merging is an emerging technique that integrates multiple models fine-tuned on different tasks into a single versatile model that performs well across multiple domains. This scheme, however, also opens up backdoor attack opportunities, in which a single malicious model can jeopardize the integrity of the merged model. Existing works demonstrate the risk of such attacks under the assumption of substantial computational resources, focusing on cases where the attacker can fully fine-tune the pre-trained model. This assumption may not be feasible given the increasing size of machine learning models. In practice, when resources are limited and the attacker can only employ techniques like Low-Rank Adaptation (LoRA) to produce the malicious model, it remains unclear whether the attack still works and poses a threat. In this work, we first identify that attack efficacy is significantly diminished when LoRA is used for fine-tuning. We then propose LoBAM, a method that yields a high attack success rate with minimal training resources. The key idea of LoBAM is to amplify the malicious weights in an intelligent way that effectively enhances attack efficacy. Through extensive empirical experiments across various model merging scenarios, we demonstrate that our design leads to an improved attack success rate. Moreover, we show that our method is highly stealthy and difficult to detect.
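The abstract does not specify how LoBAM amplifies the malicious weights, so the following is only a toy sketch of the general setting, not the authors' method. It assumes a merge that simply averages per-task weight updates onto the pre-trained weights, a LoRA-style update formed as a low-rank product `B @ A`, and a plain scalar factor (`amplification`, a hypothetical parameter) applied to the attacker's update. All function names and values are illustrative.

```python
def matmul(B, A):
    """Low-rank product B @ A, with B of shape (d x r) and A of shape (r x k)."""
    return [[sum(B[i][t] * A[t][j] for t in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def scale(M, s):
    """Multiply every entry of matrix M by the scalar s."""
    return [[s * x for x in row] for row in M]

def add(X, Y):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def merge_by_averaging(pretrained, task_deltas):
    """Assumed merge rule: pre-trained weights plus the mean of the task updates."""
    n = len(task_deltas)
    merged = [row[:] for row in pretrained]
    for delta in task_deltas:
        merged = add(merged, scale(delta, 1.0 / n))
    return merged

# Toy 2x2 weights; every number here is made up for illustration.
pretrained = [[0.0, 0.0], [0.0, 0.0]]
benign_delta = [[1.0, 0.0], [0.0, 1.0]]

# Rank-1 LoRA-style update standing in for the attacker's fine-tuning.
B = [[1.0], [2.0]]
A = [[0.5, -0.5]]
malicious_delta = matmul(B, A)

amplification = 2.0  # hypothetical scalar, not taken from the paper

plain = merge_by_averaging(pretrained, [benign_delta, malicious_delta])
amplified = merge_by_averaging(
    pretrained, [benign_delta, scale(malicious_delta, amplification)]
)

print(plain)
print(amplified)
```

In this toy setting, averaging over two models halves the attacker's update, and a factor of 2 restores its full magnitude in the merged weights. Whether and how such scaling carries over to real merging algorithms is exactly what the paper investigates.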
Cite
Text
Yin et al. "LoBAM: LoRA-Based Backdoor Attack on Model Merging." ICLR 2025 Workshops: Data_Problems, 2025.
Markdown
[Yin et al. "LoBAM: LoRA-Based Backdoor Attack on Model Merging." ICLR 2025 Workshops: Data_Problems, 2025.](https://mlanthology.org/iclrw/2025/yin2025iclrw-lobam/)
BibTeX
@inproceedings{yin2025iclrw-lobam,
title = {{LoBAM: LoRA-Based Backdoor Attack on Model Merging}},
author = {Yin, Ming and Zhang, Jingyang and Sun, Jingwei and Fang, Minghong and Li, Hai Helen and Chen, Yiran},
booktitle = {ICLR 2025 Workshops: Data_Problems},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/yin2025iclrw-lobam/}
}