Architecture-Agnostic Unsupervised Gradient Regularization for Parameter-Efficient Transfer Learning
Abstract
The advent of pre-trained foundation models has sparked significant research interest in parameter-efficient transfer learning (PETL), which focuses on the effective adaptation of these models to downstream tasks. Most PETL methods explore adaptive architectures that enhance adaptation performance with a minimal set of learnable parameters. In this paper, we instead propose an architecture-agnostic regularization strategy, Unsupervised Gradient Regularization (UGR), which incorporates an unsupervised objective to boost the target supervised task. Specifically, we sequentially process the unsupervised and target tasks within each optimization step, yet update the model solely with gradients from the target task. Through theoretical analysis, we show that this learning strategy implicitly imposes gradient directional consistency between the unsupervised and supervised tasks, thereby exerting a potent regularization effect on gradients and enhancing PETL efficacy. Moreover, the promoted gradient directional consistency further facilitates model adaptation at test time. We instantiate five UGR variants with different unsupervised objectives, such as entropy minimization and reconstruction, demonstrating the compatibility of our framework. On 25 benchmark datasets, we validate the effectiveness of UGR with three seminal PETL architectures (i.e., VPT, Adapter, and LoRA), illustrating its generalization across different adaptive architectures. Code is available at https://github.com/ZhuWenjie98/UGR .
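The per-step procedure described above can be sketched as follows. This is a minimal toy illustration inferred from the abstract, not the authors' released code: it assumes the unsupervised task is processed via a provisional parameter step that is then discarded, with only the supervised gradient (evaluated after that provisional step) applied to the original weights. All names (`ugr_step`, `grad`) and the quadratic surrogate losses are hypothetical.

```python
def grad(f, w, eps=1e-5):
    """Central-difference gradient of a scalar function of a weight list."""
    g = []
    for i in range(len(w)):
        wp, wm = list(w), list(w)
        wp[i] += eps
        wm[i] -= eps
        g.append((f(wp) - f(wm)) / (2 * eps))
    return g

def ugr_step(w, sup_loss, unsup_loss, lr=0.1):
    """One hypothetical UGR-style optimization step."""
    # 1) Provisional (later discarded) step on the unsupervised objective.
    g_u = grad(unsup_loss, w)
    w_tmp = [wi - lr * gi for wi, gi in zip(w, g_u)]
    # 2) Supervised gradient evaluated at the perturbed parameters.
    g_s = grad(sup_loss, w_tmp)
    # 3) Update the ORIGINAL weights with the supervised gradient only.
    #    A first-order expansion of sup_loss(w - lr * g_u) introduces an
    #    inner-product term between the two gradients, which is one way an
    #    implicit gradient-directional-consistency effect can arise.
    return [wi - lr * gi for wi, gi in zip(w, g_s)]

# Toy demo with made-up quadratic losses standing in for the two tasks.
sup = lambda w: (w[0] - 2.0) ** 2 + (w[1] + 1.0) ** 2
uns = lambda w: (w[0] - 1.5) ** 2 + (w[1] + 0.5) ** 2

w = [0.0, 0.0]
for _ in range(100):
    w = ugr_step(w, sup, uns)
```

Note that the fixed point of this toy scheme is pulled slightly toward the unsupervised optimum, reflecting the regularization effect rather than pure supervised training.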
Cite
Text
Zhu et al. "Architecture-Agnostic Unsupervised Gradient Regularization for Parameter-Efficient Transfer Learning." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-92089-9_20
Markdown
[Zhu et al. "Architecture-Agnostic Unsupervised Gradient Regularization for Parameter-Efficient Transfer Learning." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/zhu2024eccvw-architectureagnostic/) doi:10.1007/978-3-031-92089-9_20
BibTeX
@inproceedings{zhu2024eccvw-architectureagnostic,
title = {{Architecture-Agnostic Unsupervised Gradient Regularization for Parameter-Efficient Transfer Learning}},
author = {Zhu, Wenjie and Zhang, Yabin and Wang, Pengfei and Jin, Xin and Zeng, Wenjun and Zhang, Lei},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {318-336},
doi = {10.1007/978-3-031-92089-9_20},
url = {https://mlanthology.org/eccvw/2024/zhu2024eccvw-architectureagnostic/}
}