Enhancing Zeroth-Order Fine-Tuning for Language Models with Low-Rank Structures
Abstract
Parameter-efficient fine-tuning (PEFT) significantly reduces memory costs when adapting large language models (LLMs) for downstream applications. However, traditional first-order (FO) fine-tuning algorithms incur substantial memory overhead due to the need to store activation values for back-propagation during gradient computation, particularly in long-context fine-tuning tasks. Zeroth-order (ZO) algorithms offer a promising alternative by approximating gradients using finite differences of function values, thus eliminating the need for activation storage. Nevertheless, existing ZO methods struggle to capture the low-rank gradient structure common in LLM fine-tuning, leading to suboptimal performance. This paper proposes a low-rank ZO gradient estimator and introduces a novel **lo**w-rank **ZO** algorithm (LOZO) that effectively captures this structure in LLMs. We provide convergence guarantees for LOZO by framing it as a subspace optimization method. Additionally, its low-rank nature enables LOZO to integrate with momentum techniques while incurring negligible extra memory costs. Extensive experiments across various model sizes and downstream tasks demonstrate that LOZO and its momentum-based variant outperform existing ZO methods and closely approach the performance of FO algorithms.
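The abstract only sketches the two estimators it contrasts. As a rough illustration, the snippet below shows a classical ZO (SPSA-style) finite-difference gradient estimate alongside a low-rank variant that perturbs a weight matrix only along a rank-`r` direction. This is a minimal sketch of the general idea, not the paper's exact LOZO update: the function names, the Gaussian sampling, and the rank `r` are illustrative assumptions.

```python
import numpy as np

def zo_grad_estimate(loss_fn, w, eps=1e-3, rng=None):
    """Classical ZO estimate: probe the loss at w +/- eps*z along a
    random Gaussian direction z; no back-propagation (and hence no
    activation storage) is needed, only two forward passes."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(w.shape)
    return (loss_fn(w + eps * z) - loss_fn(w - eps * z)) / (2 * eps) * z

def low_rank_zo_grad_estimate(loss_fn, W, r=4, eps=1e-3, rng=None):
    """Illustrative low-rank variant (an assumption, not the exact LOZO
    recipe): perturb an m x n weight matrix W along a rank-r direction
    Z = U @ V.T, so the resulting estimate is itself rank-r, matching
    the low-rank gradient structure common in LLM fine-tuning."""
    rng = rng or np.random.default_rng()
    m, n = W.shape
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((n, r))
    Z = U @ V.T  # rank-r perturbation direction
    return (loss_fn(W + eps * Z) - loss_fn(W - eps * Z)) / (2 * eps) * Z

# Example: estimate the gradient of 0.5 * ||W||^2 (true gradient is W).
W = np.ones((8, 6))
g = low_rank_zo_grad_estimate(lambda M: 0.5 * np.sum(M ** 2), W, r=2)
```

Note that the rank-`r` estimate is a scalar multiple of `U @ V.T`, so it can be stored as the two thin factors rather than a full m x n matrix; this is consistent with the abstract's claim that LOZO's low-rank structure lets momentum be added with negligible extra memory.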
Cite
Text
Chen et al. "Enhancing Zeroth-Order Fine-Tuning for Language Models with Low-Rank Structures." International Conference on Learning Representations, 2025.
Markdown
[Chen et al. "Enhancing Zeroth-Order Fine-Tuning for Language Models with Low-Rank Structures." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/chen2025iclr-enhancing/)
BibTeX
@inproceedings{chen2025iclr-enhancing,
  title = {{Enhancing Zeroth-Order Fine-Tuning for Language Models with Low-Rank Structures}},
  author = {Chen, Yiming and Zhang, Yuan and Cao, Liyuan and Yuan, Kun and Wen, Zaiwen},
  booktitle = {International Conference on Learning Representations},
  year = {2025},
  url = {https://mlanthology.org/iclr/2025/chen2025iclr-enhancing/}
}