PROFIT: A Specialized Optimizer for Deep Fine Tuning

Abstract

The fine-tuning of pre-trained models has become ubiquitous in generative AI, computer vision, and robotics. Although much attention has been paid to improving the efficiency of fine-tuning models, there has been less scholarship on fine-tuning specifically for improved model performance. To remedy this gap, we present PROFIT, one of the first optimizers designed to incrementally fine-tune converged models on new tasks and/or datasets. Unlike traditional optimizers such as SGD or Adam, which make minimal assumptions due to random initializations, PROFIT takes the properties of a converged model into account explicitly to regularize the optimization process. Employing a temporal gradient-orthogonalization process, PROFIT outperforms conventional fine-tuning methods across a variety of tasks, from image classification to multimodal language model training to large-scale motion prediction. Moreover, PROFIT is encapsulated as a modular optimizer, which makes it easy to integrate directly into any training pipeline with minimal engineering effort.
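The abstract describes PROFIT only at a high level, so the following is a minimal illustrative sketch, not the authors' released implementation: it shows one plausible reading of a "temporal gradient-orthogonalization" step, in which an optimizer periodically stores an anchor gradient and, on subsequent steps, removes the component of the new gradient that lies along that anchor before updating. The class name `ProfitSketch` and the `anchor_every` parameter are hypothetical.

```python
# Hypothetical sketch of a temporal gradient-orthogonalization optimizer.
# This is NOT the official PROFIT implementation; it only illustrates the
# general idea of projecting new gradients against a stored gradient direction.
import torch
from torch.optim import Optimizer


class ProfitSketch(Optimizer):
    """Toy optimizer: every `anchor_every` steps, store the current gradient
    as an anchor; on other steps, strip the component of the gradient along
    the anchor before taking a plain SGD-style update."""

    def __init__(self, params, lr=1e-3, anchor_every=2):
        defaults = dict(lr=lr, anchor_every=anchor_every)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                step_count = state.get("step", 0)
                if step_count % group["anchor_every"] == 0:
                    # Anchor step: remember this gradient direction.
                    state["anchor"] = p.grad.detach().clone()
                    update = p.grad
                else:
                    # Orthogonalization step: remove the projection of the
                    # current gradient onto the stored anchor direction.
                    anchor = state["anchor"]
                    denom = anchor.flatten().dot(anchor.flatten()).clamp(min=1e-12)
                    coeff = p.grad.flatten().dot(anchor.flatten()) / denom
                    update = p.grad - coeff * anchor
                p.add_(update, alpha=-group["lr"])
                state["step"] = step_count + 1
        return loss
```

Because it subclasses `torch.optim.Optimizer`, such a sketch can be dropped into an existing training loop in place of SGD or Adam, which is consistent with the abstract's claim that PROFIT is packaged as a modular optimizer; the actual projection schedule and regularization used by PROFIT should be taken from the paper itself.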

Cite

Text

Chakravarthy et al. "PROFIT: A Specialized Optimizer for Deep Fine Tuning." Advances in Neural Information Processing Systems, 2025.

Markdown

[Chakravarthy et al. "PROFIT: A Specialized Optimizer for Deep Fine Tuning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/chakravarthy2025neurips-profit/)

BibTeX

@inproceedings{chakravarthy2025neurips-profit,
  title     = {{PROFIT: A Specialized Optimizer for Deep Fine Tuning}},
  author    = {Chakravarthy, Anirudh S and Zheng, Shuai Kyle and Huang, Xin and Hemachandra, Sachithra and Zhang, Xiao and Chai, Yuning and Chen, Zhao},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/chakravarthy2025neurips-profit/}
}