LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Abstract
Visual prompting has gained popularity as a method for adapting pre-trained models to specific tasks, particularly in the realm of parameter-efficient tuning. However, existing visual prompting techniques often pad the prompt parameters around the image, limiting the interaction between the visual prompts and the original image to a small set of patches while neglecting the inductive bias present in shared information across different patches. In this study, we conduct a thorough preliminary investigation to identify and address these limitations. We propose a novel visual prompt design, introducing **Lo**w-**R**ank matrix multiplication for **V**isual **P**rompting (LoR-VP), which enables shared and patch-specific information across rows and columns of image pixels. Extensive experiments across seven network architectures and four datasets demonstrate significant improvements in both performance and efficiency compared to state-of-the-art visual prompting methods, achieving up to $6\times$ faster training times, utilizing $18\times$ fewer visual prompt parameters, and delivering a 3.1% improvement in performance.
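To make the idea concrete, below is a minimal sketch of a low-rank visual prompt in PyTorch. It assumes the prompt is formed as the product of two small per-channel factors (one spanning image rows, one spanning columns) and added to the input pixels, so information is shared across entire rows and columns rather than confined to padded border patches. The class name, initialization scheme, and default rank are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class LowRankVisualPrompt(nn.Module):
    """Illustrative low-rank visual prompt (sketch, not the official LoR-VP code).

    The prompt is the product of two small factors B (C x H x r) and A (C x r x W);
    their product gives a full-resolution additive prompt shared across the batch.
    """

    def __init__(self, height=224, width=224, channels=3, rank=4):
        super().__init__()
        # Only these low-rank factors are trained; the backbone stays frozen.
        self.B = nn.Parameter(torch.randn(channels, height, rank) * 0.01)
        self.A = nn.Parameter(torch.zeros(channels, rank, width))

    def forward(self, x):
        # prompt has shape (C, H, W); batched matmul over the channel dimension.
        prompt = torch.matmul(self.B, self.A)
        return x + prompt.unsqueeze(0)

# Usage sketch: wrap a frozen pre-trained model so only the prompt is optimized.
# prompt = LowRankVisualPrompt(rank=4)
# logits = frozen_model(prompt(images))
```

Because only the two factors are learned, the number of prompt parameters grows as r(H + W) per channel instead of HW, which is where the parameter savings over full-image or padded prompts come from.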
Cite
Text

Jin et al. "LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation." International Conference on Learning Representations, 2025.

Markdown

[Jin et al. "LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/jin2025iclr-lorvp/)

BibTeX
@inproceedings{jin2025iclr-lorvp,
  title     = {{LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation}},
  author    = {Jin, Can and Li, Ying and Zhao, Mingyu and Zhao, Shiyu and Wang, Zhenting and He, Xiaoxiao and Han, Ligong and Che, Tong and Metaxas, Dimitris N.},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/jin2025iclr-lorvp/}
}