AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data

Abstract

Foundation models encode a rich representation that can be adapted to a desired task by fine-tuning on task-specific data. However, fine-tuning a model on one particular data distribution often compromises the model's original performance on other distributions. Current methods for robust fine-tuning utilize various hand-crafted regularization techniques to constrain the fine-tuning process towards the base foundation model. Yet, it is hard to directly specify what characteristics of the foundation model to retain during fine-tuning, as this is influenced by the complex interplay between the pre-training, fine-tuning, and evaluation distributions. We propose AutoFT, a data-driven method for guiding foundation model adaptation: optimizing hyperparameters for fine-tuning with respect to post-adaptation performance on a small out-of-distribution (OOD) validation set. We find that when optimizing hyperparameters for OOD generalization, it is especially beneficial to use a highly expressive hyperparameter space such as per-layer learning rates and loss weight coefficients. Our evaluation demonstrates state-of-the-art performance on OOD distributions unseen during fine-tuning and hyperparameter optimization.
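At a high level, the abstract describes an outer loop over fine-tuning hyperparameters: each candidate configuration (for example, per-layer learning rates and loss weight coefficients) is scored by fine-tuning with it and then measuring accuracy on a small OOD validation set. The sketch below illustrates that loop in PyTorch, assuming plain random search and an illustrative entropy term as the extra loss coefficient; the function names, search ranges, and loss terms are assumptions for illustration, not the authors' implementation or search strategy.

# Hypothetical sketch of the AutoFT idea: search an expressive fine-tuning
# hyperparameter space (per-layer learning rates, loss weight coefficients)
# and score each candidate by post-fine-tuning accuracy on a small OOD
# validation set. Random search is used here only for brevity.
import copy
import random

import torch
import torch.nn.functional as F


def fine_tune(model, loader, layer_lrs, ce_weight, ent_weight, steps=100):
    """Fine-tune a copy of `model` with per-layer learning rates and weighted losses."""
    model = copy.deepcopy(model)
    # One optimizer parameter group per top-level submodule, each with its own lr.
    param_groups = [
        {"params": layer.parameters(), "lr": lr}
        for layer, lr in zip(model.children(), layer_lrs)
    ]
    opt = torch.optim.AdamW(param_groups)
    it = iter(loader)
    for _ in range(steps):
        try:
            x, y = next(it)
        except StopIteration:
            it = iter(loader)
            x, y = next(it)
        logits = model(x)
        probs = logits.softmax(dim=-1)
        # Weighted combination of loss terms; the entropy term here is an
        # illustrative stand-in for the paper's learnable loss coefficients.
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
        loss = ce_weight * F.cross_entropy(logits, y) + ent_weight * entropy
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


@torch.no_grad()
def ood_accuracy(model, ood_val_loader):
    """Accuracy on the small OOD validation set used to score hyperparameters."""
    correct, total = 0, 0
    for x, y in ood_val_loader:
        correct += (model(x).argmax(dim=-1) == y).sum().item()
        total += y.numel()
    return correct / total


def autoft_search(model, id_train_loader, ood_val_loader, num_trials=20):
    """Pick fine-tuning hyperparameters that maximize OOD validation accuracy."""
    num_layers = len(list(model.children()))
    best_acc, best_model = -1.0, None
    for _ in range(num_trials):
        # Sample an expressive hyperparameter configuration (ranges are assumptions).
        layer_lrs = [10 ** random.uniform(-6, -3) for _ in range(num_layers)]
        ce_weight = 10 ** random.uniform(-1, 1)
        ent_weight = 10 ** random.uniform(-3, 0)
        candidate = fine_tune(model, id_train_loader, layer_lrs, ce_weight, ent_weight)
        acc = ood_accuracy(candidate, ood_val_loader)
        if acc > best_acc:
            best_acc, best_model = acc, candidate
    return best_model, best_acc

In this sketch the search strategy is deliberately simple; any hyperparameter optimizer could replace the random-sampling loop, since the key ingredient the abstract emphasizes is scoring candidates by post-adaptation performance on held-out OOD data rather than in-distribution validation accuracy.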

Cite

Text

Choi et al. "AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data." NeurIPS 2023 Workshops: DistShift, 2023.

Markdown

[Choi et al. "AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data." NeurIPS 2023 Workshops: DistShift, 2023.](https://mlanthology.org/neuripsw/2023/choi2023neuripsw-autoft/)

BibTeX

@inproceedings{choi2023neuripsw-autoft,
  title     = {{AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data}},
  author    = {Choi, Caroline and Lee, Yoonho and Chen, Annie S and Zhou, Allan and Raghunathan, Aditi and Finn, Chelsea},
  booktitle = {NeurIPS 2023 Workshops: DistShift},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/choi2023neuripsw-autoft/}
}