Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning
Abstract
As the size of language models grows, fine-tuning them becomes more challenging: fine-tuning with first-order optimizers (e.g., SGD and Adam) requires high memory consumption, while fine-tuning with a memory-efficient zeroth-order optimizer (MeZO) suffers a significant accuracy drop and a slower convergence rate. In this work, we propose a Low-Order Hybrid Optimizer (LoHO), which merges zeroth-order (ZO) and first-order (FO) optimizers for fine-tuning. LoHO is empowered with inter-layer hybrid optimization and intra-layer hybrid optimization, which boost the accuracy of MeZO while keeping memory usage within a budget. The inter-layer hybrid optimization exploits the FO optimizer in deep layers and the ZO optimizer in shallow ones, thereby avoiding unnecessary gradient propagation to improve memory efficiency. The intra-layer hybrid optimization updates a proportion of parameters in a layer with the ZO optimizer and the rest with the FO optimizer, taking advantage of gradient sparsity for a high-efficiency implementation. Our experimental results across common datasets on different pre-trained backbones (i.e., RoBERTa-large, OPT-13B and OPT-30B) demonstrate that LoHO can significantly improve the predictive accuracy and convergence rate of MeZO, while controlling the memory footprint during fine-tuning. Moreover, LoHO achieves performance comparable to first-order fine-tuning while using substantially less memory.
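To make the inter-layer hybrid idea concrete, below is a minimal sketch of one hybrid update step, assuming a toy two-part model. Everything here (the TwoPartModel, hybrid_step, and zo_perturb names, the plain-SGD updates, and the eps/lr values) is illustrative and not the paper's implementation. The ZO half follows a MeZO-style SPSA estimator (two forward passes with a shared random seed, so the perturbation never needs to be stored), and the FO half backpropagates only through the deep layers, so activations for the shallow layers need not be kept.

import torch
import torch.nn as nn

class TwoPartModel(nn.Module):
    """Toy model: 'shallow' layers get ZO updates, 'deep' layers get FO."""
    def __init__(self, d=32):
        super().__init__()
        self.shallow = nn.Sequential(nn.Linear(d, d), nn.ReLU())  # ZO-updated
        self.deep = nn.Linear(d, 2)                               # FO-updated

    def forward(self, x):
        return self.deep(self.shallow(x))

def zo_perturb(params, seed, eps, scale):
    # Regenerate the same Gaussian noise z from `seed` and add scale*eps*z in place.
    gen = torch.Generator().manual_seed(seed)
    for p in params:
        z = torch.randn(p.shape, generator=gen)
        p.data.add_(scale * eps * z)

def hybrid_step(model, x, y, loss_fn, lr=1e-3, eps=1e-3):
    zo_params = list(model.shallow.parameters())
    seed = torch.randint(0, 2**31 - 1, (1,)).item()

    # ZO part (MeZO-style SPSA): two forward passes, no activations stored.
    with torch.no_grad():
        zo_perturb(zo_params, seed, eps, +1)   # theta + eps*z
        loss_plus = loss_fn(model(x), y)
        zo_perturb(zo_params, seed, eps, -2)   # theta - eps*z
        loss_minus = loss_fn(model(x), y)
        zo_perturb(zo_params, seed, eps, +1)   # restore theta
        proj_grad = (loss_plus - loss_minus) / (2 * eps)

    # FO part: run the shallow layers under no_grad so their activations
    # are discarded; backprop only reaches the deep layers.
    with torch.no_grad():
        h = model.shallow(x)
    loss = loss_fn(model.deep(h), y)
    model.deep.zero_grad()
    loss.backward()

    with torch.no_grad():
        # ZO update: regenerate z from the same seed, step along proj_grad * z.
        gen = torch.Generator().manual_seed(seed)
        for p in zo_params:
            z = torch.randn(p.shape, generator=gen)
            p.data.add_(-lr * proj_grad * z)
        # FO update: plain SGD on the deep layers.
        for p in model.deep.parameters():
            p.data.add_(-lr * p.grad)
    return loss.item()

# Example usage on toy data:
model = TwoPartModel()
x, y = torch.randn(8, 32), torch.randint(0, 2, (8,))
print(hybrid_step(model, x, y, nn.CrossEntropyLoss()))

Reusing one random seed for both the perturbation and the update is what keeps the ZO half memory-efficient: the noise tensor z is regenerated on demand rather than stored alongside the parameters.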
Cite
Text
Chen et al. "Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I22.34530

Markdown

[Chen et al. "Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/chen2025aaai-efficient/) doi:10.1609/AAAI.V39I22.34530

BibTeX
@inproceedings{chen2025aaai-efficient,
title = {{Towards Efficient Low-Order Hybrid Optimizer for Language Model Fine-Tuning}},
author = {Chen, Minping and Huang, You-Liang and Wen, Zeyi},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
  pages = {23605--23613},
doi = {10.1609/AAAI.V39I22.34530},
url = {https://mlanthology.org/aaai/2025/chen2025aaai-efficient/}
}