Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning
Abstract
Large-scale pre-trained foundation models have demonstrated remarkable generalization across diverse computer vision tasks through fine-tuning. However, existing fine-tuning approaches often struggle in extreme cross-domain few-shot learning scenarios, primarily because of the significant domain shift between the pre-training data and the target tasks, and because annotated target samples are scarce. To mitigate this issue, we propose a novel absorption adaptation learning framework that regularizes the fine-tuning of the foundation model in two ways, using an expert model with the same architecture but trained from scratch on the target data. First, we design a masked cross-model unidirectional reconstruction scheme, which forces the foundation model to recover the intermediate features of the expert model under random masking. Second, we develop a decision graph association loss that encourages the token similarity matrices of the two models to be consistent. In this way, the task-relevant semantic knowledge in the expert model, at both the intermediate feature level and the final decision level, is extracted and absorbed by the foundation model during fine-tuning, mitigating the performance drop caused by the domain gap and limited annotation. Extensive experiments, together with further observations and analyses, support our argument. The code is available at https://github.com/NWPUZhoufei/FMA.
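The two regularizers described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes that both models emit token features of shape `(num_tokens, dim)`, that the reconstruction target is compared with a plain mean squared error at randomly chosen token positions, and that the token similarity matrix is a cosine similarity matrix. The function names `masked_reconstruction_loss`, `token_similarity`, and `graph_association_loss`, and the parameter `mask_ratio`, are hypothetical.

```python
import numpy as np


def masked_reconstruction_loss(fm_feat, expert_feat, mask_ratio=0.5, rng=None):
    """MSE between foundation-model and expert features at random token positions.

    Assumption: the expert features act as fixed targets (the "unidirectional"
    part), and masking means selecting a random subset of token positions.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_tokens = fm_feat.shape[0]
    num_masked = max(1, int(round(mask_ratio * num_tokens)))
    masked_idx = rng.choice(num_tokens, size=num_masked, replace=False)
    diff = fm_feat[masked_idx] - expert_feat[masked_idx]
    return float(np.mean(diff ** 2))


def token_similarity(tokens):
    """Cosine similarity between every pair of tokens (assumed graph definition)."""
    norms = np.linalg.norm(tokens, axis=1, keepdims=True)
    normed = tokens / (norms + 1e-8)
    return normed @ normed.T


def graph_association_loss(fm_tokens, expert_tokens):
    """MSE between the token similarity matrices of the two models."""
    diff = token_similarity(fm_tokens) - token_similarity(expert_tokens)
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fm = rng.normal(size=(16, 8))      # hypothetical foundation-model tokens
    expert = rng.normal(size=(16, 8))  # hypothetical expert-model tokens
    total = masked_reconstruction_loss(fm, expert, rng=rng) + graph_association_loss(fm, expert)
    print(total)
```

In an actual fine-tuning loop these terms would presumably be added, with weighting coefficients, to the few-shot classification loss, with gradients flowing only into the foundation model. The abstract does not specify those details, so they are left out here.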
Cite
Text
Zhou et al. "Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning." International Conference on Computer Vision, 2025.
Markdown
[Zhou et al. "Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/zhou2025iccv-effective/)
BibTeX
@inproceedings{zhou2025iccv-effective,
title = {{Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning}},
author = {Zhou, Fei and Wang, Peng and Zhang, Lei and Wei, Wei and Ding, Chen and Lin, Guosheng and Zhang, Yanning},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {4582-4593},
url = {https://mlanthology.org/iccv/2025/zhou2025iccv-effective/}
}