Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning
Abstract
As the size of language models grows, fine-tuning them becomes more challenging: fine-tuning with first-order optimizers (e.g., SGD and Adam) requires high memory consumption, while fine-tuning with a memory-efficient zeroth-order optimizer (MeZO) suffers a significant accuracy drop and a slower convergence rate. In this work, we propose a Low-Order Hybrid Optimizer (LoHO), which merges zeroth-order (ZO) and first-order (FO) optimizers for fine-tuning. LoHO is empowered with inter-layer hybrid optimization and intra-layer hybrid optimization, which boost the accuracy of MeZO while keeping memory usage within a budget. The inter-layer hybrid optimization exploits the FO optimizer in deep layers and the ZO optimizer in shallow ones, thereby avoiding unnecessary gradient propagation to improve memory efficiency. The intra-layer hybrid optimization updates a proportion of parameters in a layer with the ZO optimizer and the rest with the FO optimizer, taking advantage of gradient sparsity for a high-efficiency implementation. Our experimental results across common datasets on different pre-trained backbones (i.e., RoBERTa-large, OPT-13B and OPT-30B) demonstrate that LoHO can significantly improve the predictive accuracy and convergence rate of MeZO, while controlling the memory footprint during fine-tuning. Moreover, LoHO achieves performance comparable to first-order fine-tuning while using substantially less memory.
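To make the inter-layer hybrid idea concrete, below is a minimal sketch of one hybrid update step, assuming a toy two-part model. Everything here (the TwoPartModel, hybrid_step, and zo_perturb names, the plain-SGD updates, and the eps/lr values) is illustrative and not the paper's implementation. The ZO half follows a MeZO-style SPSA estimator (two forward passes with a shared random seed, so the perturbation never needs to be stored), and the FO half backpropagates only through the deep layers, so activations for the shallow layers need not be kept.

import torch
import torch.nn as nn

class TwoPartModel(nn.Module):
    """Toy model: 'shallow' layers get ZO updates, 'deep' layers get FO."""
    def __init__(self, d=32):
        super().__init__()
        self.shallow = nn.Sequential(nn.Linear(d, d), nn.ReLU())  # ZO-updated
        self.deep = nn.Linear(d, 2)                               # FO-updated

    def forward(self, x):
        return self.deep(self.shallow(x))

def zo_perturb(params, seed, eps, scale):
    # Regenerate the same Gaussian noise z from `seed` and add scale*eps*z in place.
    gen = torch.Generator().manual_seed(seed)
    for p in params:
        z = torch.randn(p.shape, generator=gen)
        p.data.add_(scale * eps * z)

def hybrid_step(model, x, y, loss_fn, lr=1e-3, eps=1e-3):
    zo_params = list(model.shallow.parameters())
    seed = torch.randint(0, 2**31 - 1, (1,)).item()

    # ZO part (MeZO-style SPSA): two forward passes, no activations stored.
    with torch.no_grad():
        zo_perturb(zo_params, seed, eps, +1)   # theta + eps*z
        loss_plus = loss_fn(model(x), y)
        zo_perturb(zo_params, seed, eps, -2)   # theta - eps*z
        loss_minus = loss_fn(model(x), y)
        zo_perturb(zo_params, seed, eps, +1)   # restore theta
        proj_grad = (loss_plus - loss_minus) / (2 * eps)

    # FO part: run the shallow layers under no_grad so their activations
    # are discarded; backprop only reaches the deep layers.
    with torch.no_grad():
        h = model.shallow(x)
    loss = loss_fn(model.deep(h), y)
    model.deep.zero_grad()
    loss.backward()

    with torch.no_grad():
        # ZO update: regenerate z from the same seed, step along proj_grad * z.
        gen = torch.Generator().manual_seed(seed)
        for p in zo_params:
            z = torch.randn(p.shape, generator=gen)
            p.data.add_(-lr * proj_grad * z)
        # FO update: plain SGD on the deep layers.
        for p in model.deep.parameters():
            p.data.add_(-lr * p.grad)
    return loss.item()

# Example usage on toy data:
model = TwoPartModel()
x, y = torch.randn(8, 32), torch.randint(0, 2, (8,))
print(hybrid_step(model, x, y, nn.CrossEntropyLoss()))

Reusing one random seed for both the perturbation and the update is what keeps the ZO half memory-efficient: the noise tensor z is regenerated on demand rather than stored alongside the parameters.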
Cite
Text
Chen et al. "Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I22.34530

Markdown

[Chen et al. "Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/chen2025aaai-efficient/) doi:10.1609/AAAI.V39I22.34530

BibTeX
@inproceedings{chen2025aaai-efficient,
title = {{Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning}},
author = {Chen, Minping and Huang, You-Liang and Wen, Zeyi},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
  pages = {23605--23613},
doi = {10.1609/AAAI.V39I22.34530},
url = {https://mlanthology.org/aaai/2025/chen2025aaai-efficient/}
}