AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data
Abstract
Foundation models encode a rich representation that can be adapted to a desired task by fine-tuning on task-specific data. However, fine-tuning a model on one particular data distribution often compromises the model's original performance on other distributions. Current methods for robust fine-tuning utilize various hand-crafted regularization techniques to constrain the fine-tuning process towards the base foundation model. Yet, it is hard to directly specify what characteristics of the foundation model to retain during fine-tuning, as this is influenced by the complex interplay between the pre-training, fine-tuning, and evaluation distributions. We propose AutoFT, a data-driven method for guiding foundation model adaptation: optimizing hyperparameters for fine-tuning with respect to post-adaptation performance on a small out-of-distribution (OOD) validation set. We find that when optimizing hyperparameters for OOD generalization, it is especially beneficial to use a highly expressive hyperparameter space such as per-layer learning rates and loss weight coefficients. Our evaluation demonstrates state-of-the-art performance on OOD distributions unseen during fine-tuning and hyperparameter optimization.
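As a rough illustration of the idea described in the abstract, the toy sketch below tunes per-parameter learning rates and a regularization weight, scoring each candidate by its post-fine-tuning loss on a small OOD validation set. Everything in it is an assumption made for illustration: the linear model, the synthetic data, the random-search outer loop (a stand-in for whichever hyperparameter optimizer is actually used), and the names `make_toy_data`, `fine_tune`, and `search_hyperparameters` are hypothetical and not taken from the paper.

```python
import numpy as np


def make_toy_data(seed=0, d=4, n=64):
    """Synthetic stand-in: 'pretrained' weights, noisy ID training data, shifted OOD validation data."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    w0 = w_true + 0.1 * rng.normal(size=d)       # pretrained weights, close to the truth
    X_tr = rng.normal(size=(n, d))
    y_tr = X_tr @ w_true + 0.5 * rng.normal(size=n)  # noisy in-distribution labels
    X_val = 2.0 * rng.normal(size=(16, d)) + 1.0     # shifted and rescaled inputs
    y_val = X_val @ w_true
    return w0, (X_tr, y_tr), (X_val, y_val)


def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))


def fine_tune(w0, X, y, lrs, reg_weight, steps=50):
    """Gradient descent on MSE + reg_weight * ||w - w0||^2 with one learning rate per parameter."""
    w = w0.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * reg_weight * (w - w0)
        w -= lrs * grad  # elementwise: each parameter has its own step size
    return w


def search_hyperparameters(w0, train, ood_val, n_trials=30, seed=0):
    """Random search (an assumed stand-in optimizer) that keeps the trial with the lowest OOD validation MSE."""
    rng = np.random.default_rng(seed)
    X_tr, y_tr = train
    X_val, y_val = ood_val
    best = None
    for _ in range(n_trials):
        lrs = 10 ** rng.uniform(-4, -1, size=w0.shape)  # per-parameter learning rates
        reg_weight = 10 ** rng.uniform(-3, 0)           # weight of the pull toward w0
        w = fine_tune(w0, X_tr, y_tr, lrs, reg_weight)
        score = mse(w, X_val, y_val)                    # objective: post-adaptation OOD loss
        if best is None or score < best["ood_val_mse"]:
            best = {"lrs": lrs, "reg_weight": reg_weight,
                    "weights": w, "ood_val_mse": score}
    return best
```

Calling `search_hyperparameters(*make_toy_data())` returns the best trial as a dictionary. The key design point mirrored from the abstract is that the inner loop only sees fine-tuning data, while the outer loop selects hyperparameters using OOD validation performance.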
Cite
Text
Choi et al. "AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data." NeurIPS 2023 Workshops: DistShift, 2023.
Markdown
[Choi et al. "AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data." NeurIPS 2023 Workshops: DistShift, 2023.](https://mlanthology.org/neuripsw/2023/choi2023neuripsw-autoft/)
BibTeX
@inproceedings{choi2023neuripsw-autoft,
title = {{AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data}},
author = {Choi, Caroline and Lee, Yoonho and Chen, Annie S and Zhou, Allan and Raghunathan, Aditi and Finn, Chelsea},
booktitle = {NeurIPS 2023 Workshops: DistShift},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/choi2023neuripsw-autoft/}
}