Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Abstract
The Segment Anything Model (SAM) stands as a foundational framework for image segmentation. While it exhibits remarkable zero-shot generalization in typical scenarios, its advantage diminishes when applied to specialized domains such as medical imagery and remote sensing. To address this limitation, this paper introduces Conv-LoRA, a simple yet effective parameter-efficient fine-tuning approach. By integrating ultra-lightweight convolutional parameters into Low-Rank Adaptation (LoRA), Conv-LoRA injects image-related inductive biases into the plain ViT encoder, further reinforcing SAM’s local prior assumption. Notably, Conv-LoRA not only preserves SAM’s extensive segmentation knowledge but also revives its capacity for learning high-level image semantics, which is constrained by SAM’s foreground-background segmentation pretraining. Comprehensive experiments across diverse benchmarks spanning multiple domains underscore Conv-LoRA’s superiority in adapting SAM to real-world semantic segmentation tasks.
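The core mechanism described in the abstract, inserting a lightweight convolution into the LoRA bottleneck so the adapter carries a local image prior, can be sketched as follows. This is a minimal PyTorch illustration under assumed details: a single 3x3 convolution on the reshaped token grid stands in for the paper's full design (which uses a mixture of multi-scale convolutional experts), and the names ConvLoRALinear, rank, and alpha are hypothetical.

import math
import torch
import torch.nn as nn


class ConvLoRALinear(nn.Module):
    """Hypothetical Conv-LoRA-style adapter around a frozen linear layer.

    Tokens are down-projected to rank r, reshaped to a 2D grid, passed
    through a small conv (the local inductive bias), then up-projected.
    """

    def __init__(self, in_features, out_features, rank=4, alpha=4.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)  # pretrained SAM weight, kept frozen
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.lora_down = nn.Linear(in_features, rank, bias=False)
        self.conv = nn.Conv2d(rank, rank, kernel_size=3, padding=1)  # ultra-lightweight: acts at rank r
        self.lora_up = nn.Linear(rank, out_features, bias=False)
        self.scale = alpha / rank
        nn.init.kaiming_uniform_(self.lora_down.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_up.weight)  # adapter starts as a no-op

    def forward(self, x):
        # x: (B, N, in_features) patch tokens; assumes N forms a square grid
        h = self.lora_down(x)                               # (B, N, r)
        b, n, r = h.shape
        side = int(math.sqrt(n))
        fmap = h.transpose(1, 2).reshape(b, r, side, side)  # tokens -> 2D feature map
        fmap = fmap + self.conv(fmap)                       # inject local (image) prior
        h = fmap.reshape(b, r, n).transpose(1, 2)           # back to token sequence
        return self.base(x) + self.scale * self.lora_up(h)


# Usage sketch: wrap a projection inside a ViT block.
layer = ConvLoRALinear(768, 768, rank=4)
tokens = torch.randn(2, 196, 768)  # 14x14 patch grid
out = layer(tokens)                # (2, 196, 768)

Because only the down/up projections and the tiny bottleneck conv are trainable, the parameter overhead over plain LoRA is negligible while the conv restores locality that a plain ViT encoder lacks.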
Cite
Text
Zhong et al. "Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model." International Conference on Learning Representations, 2024.Markdown
[Zhong et al. "Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/zhong2024iclr-convolution/)BibTeX
@inproceedings{zhong2024iclr-convolution,
  title = {{Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model}},
  author = {Zhong, Zihan and Tang, Zhiqiang and He, Tong and Fang, Haoyang and Yuan, Chun},
  booktitle = {International Conference on Learning Representations},
  year = {2024},
  url = {https://mlanthology.org/iclr/2024/zhong2024iclr-convolution/}
}