Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition

Li, Mengke; Liu, Ye; Lu, Yang; Zhang, Yiqun; Cheung, Yiu-ming; Huang, Hui

doi:10.52202/079017-3304

NeurIPS 2024

Abstract

Long-tailed visual recognition has received increasing attention recently. Although fine-tuning techniques such as visual prompt tuning (VPT) achieve substantial performance improvements by leveraging pre-trained knowledge, models still generalize poorly on tail classes. To address this issue, we propose Gaussian neighborhood minimization prompt tuning (GNM-PT), a novel optimization strategy for VPT in long-tailed learning. We introduce a novel Gaussian neighborhood loss, which provides a tight upper bound on the loss over the data distribution and encourages a flatter loss landscape, a property correlated with improved model generalization. Specifically, during each gradient update, GNM-PT seeks the gradient descent direction within a random parameter neighborhood that is independent of the input samples. As a result, GNM-PT enhances generalization across all classes while reducing computational overhead. The proposed GNM-PT achieves state-of-the-art classification accuracies of 90.3%, 76.5%, and 50.1% on the benchmark datasets CIFAR100-LT (IR 100), iNaturalist 2018, and Places-LT, respectively. The source code is available at https://github.com/Keke921/GNM-PT.
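
The update rule described in the abstract (take the descent direction at a randomly perturbed point in parameter space, with the perturbation drawn independently of the input batch) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gnm_step`, `toy_grad`, and `run_demo`, the noise scale `sigma`, and the toy quadratic objective are all assumptions made for illustration, and the actual GNM-PT loss and prompt-tuning setup live in the linked repository.

```python
import random


def gnm_step(params, grad_fn, lr=0.1, sigma=0.05, rng=random):
    """One hypothetical Gaussian-neighborhood update on a list of floats.

    Assumption: the neighborhood is sampled as isotropic Gaussian noise
    with standard deviation `sigma`; the paper may scale it differently.
    """
    # The offset depends only on sigma, never on the data batch.
    noise = [rng.gauss(0.0, sigma) for _ in params]
    perturbed = [p + e for p, e in zip(params, noise)]
    # A single gradient evaluation, taken at the perturbed point.
    grads = grad_fn(perturbed)
    # The gradient is applied to the original (unperturbed) parameters.
    return [p - lr * g for p, g in zip(params, grads)]


def toy_grad(params, target=(1.0, -2.0)):
    """Gradient of the toy loss sum((p - t)^2); stands in for a real model."""
    return [2.0 * (p - t) for p, t in zip(params, target)]


def run_demo(steps=200, seed=0):
    """Run the sketch on the toy loss and return the final parameters."""
    rng = random.Random(seed)
    params = [0.0, 0.0]
    for _ in range(steps):
        params = gnm_step(params, toy_grad, lr=0.1, sigma=0.05, rng=rng)
    return params


if __name__ == "__main__":
    print(run_demo())
```

In this sketch each step costs one gradient evaluation, since the perturbation is sampled rather than computed from the data. That is one plausible reading of the reduced overhead the abstract mentions, though the abstract itself does not specify the baseline.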

Cite

Text

Li et al. "Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition." Neural Information Processing Systems, 2024. doi:10.52202/079017-3304

Markdown

[Li et al. "Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/li2024neurips-improving-a/) doi:10.52202/079017-3304

BibTeX

@inproceedings{li2024neurips-improving-a,
  title     = {{Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition}},
  author    = {Li, Mengke and Liu, Ye and Lu, Yang and Zhang, Yiqun and Cheung, Yiu-ming and Huang, Hui},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3304},
  url       = {https://mlanthology.org/neurips/2024/li2024neurips-improving-a/}
}