Semantic-Guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift
Abstract
In this work, we focus on cross-domain few-shot classification (CDFSC), which is challenged mainly by the low-data problem and by the extreme domain shift between base and novel target classes. Current methods typically employ a lightweight backbone and continue to use a linear-probe-like traditional fine-tuning (Trad-FT) paradigm. For recently emerging large-scale pre-trained models (LPMs), however, which have more parameters and considerable prior knowledge, Trad-FT carries significant risks of overfitting and of damaging that prior knowledge. In this paper, we propose semantic-guided robustness tuning (SRT), a novel fine-tuning paradigm comprising modulus-matching-based image-text mixup (MMIT-Mixup) and robustness-invariance fine-tuning (RI-FT), to address the CDFSC challenge for LPMs. Concretely, SRT focuses on achieving robust class-specific representations. It first treats textual information as a robust and domain-invariant conductor, and MMIT-Mixup injects this domain-invariant, class-specific knowledge to obtain domain-invariant prototypes. Then, RI-FT optimizes the distance between features and prototypes to enhance the robustness of the visual encoder. We consider several types of LPMs and conduct extensive experiments, which show that SRT is a general solution to the CDFSC challenge for LPMs and outperforms existing methods by a large margin.
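The abstract does not give formulas for MMIT-Mixup or RI-FT, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes that "modulus matching" means rescaling a class text feature to the L2 norm of an image feature before a convex mixup, that a prototype is the mean of the mixed features, and that the tuning objective is a squared Euclidean distance to the prototype. All function names and the mixing weight `lam` are hypothetical.

```python
import math


def l2_norm(vec):
    """L2 norm (modulus) of a feature vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def modulus_match(text_feat, image_feat):
    """Assumption: rescale the text feature so its norm equals the image feature's."""
    text_norm = l2_norm(text_feat)
    if text_norm == 0.0:
        raise ValueError("text feature has zero norm")
    scale = l2_norm(image_feat) / text_norm
    return [scale * x for x in text_feat]


def image_text_mixup(image_feat, text_feat, lam=0.5):
    """Convex mix of an image feature with the modulus-matched text feature.

    `lam` is a hypothetical mixing weight; the abstract does not specify one.
    """
    matched = modulus_match(text_feat, image_feat)
    return [lam * i + (1.0 - lam) * t for i, t in zip(image_feat, matched)]


def class_prototype(image_feats, text_feat, lam=0.5):
    """Assumption: the prototype is the mean of the mixed support features."""
    mixed = [image_text_mixup(f, text_feat, lam) for f in image_feats]
    return [sum(col) / len(mixed) for col in zip(*mixed)]


def prototype_distance(feature, prototype):
    """Squared Euclidean distance, a stand-in for the feature-to-prototype objective."""
    return sum((f - p) ** 2 for f, p in zip(feature, prototype))


# Toy usage with 2-D features for one class and two support images.
support = [[0.0, 10.0], [0.0, 5.0]]
text = [3.0, 4.0]
proto = class_prototype(support, text, lam=0.5)
loss = prototype_distance(support[0], proto)
```

In an actual fine-tuning loop, a distance of this kind would be minimized with respect to the visual encoder's parameters; the sketch only shows the feature-space arithmetic.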
Cite
Text
Xiao et al. "Semantic-Guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72967-6_17
Markdown
[Xiao et al. "Semantic-Guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/xiao2024eccv-semanticguided/) doi:10.1007/978-3-031-72967-6_17
BibTeX
@inproceedings{xiao2024eccv-semanticguided,
title = {{Semantic-Guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift}},
author = {Xiao, Kangyu and Wang, Zilei and Li, Junjie},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72967-6_17},
url = {https://mlanthology.org/eccv/2024/xiao2024eccv-semanticguided/}
}