RILQ: Rank-Insensitive LoRA-Based Quantization Error Compensation for Boosting 2-Bit Large Language Model Accuracy

Abstract

Low-rank adaptation (LoRA) has become the dominant method for parameter-efficient LLM fine-tuning, with LoRA-based quantization error compensation (LQEC) emerging as a powerful tool for recovering accuracy in compressed LLMs. However, LQEC has underperformed in sub-4-bit scenarios, and this limitation had not been previously investigated. We propose RILQ (Rank-Insensitive LoRA-based Quantization Error Compensation) to boost 2-bit LLM accuracy. Based on a rank analysis revealing the rank-insensitive nature of a model-wise activation discrepancy loss, RILQ employs this loss to adjust adapters cooperatively across layers, enabling robust error compensation with low-rank adapters. Evaluations on LLaMA-2 and LLaMA-3 demonstrate RILQ's consistent improvements in 2-bit quantized inference across various state-of-the-art quantizers and enhanced accuracy in task-specific fine-tuning. RILQ maintains computational efficiency comparable to existing LoRA methods, enabling adapter-merged weight-quantized LLM inference with significantly enhanced accuracy, making it a promising approach for boosting 2-bit LLM performance.
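To make the LQEC idea concrete, the following is a minimal numpy sketch, not the authors' implementation: a weight matrix is crudely quantized to 2 bits, and a rank-r adapter (A, B) compensates the quantization error. For simplicity the adapter here is fitted to the layer's weight error via truncated SVD; RILQ's actual contribution is instead to train the adapters jointly against a model-wise activation discrepancy loss. All variable names and the toy quantizer are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4

W = rng.normal(size=(d, d))            # full-precision weight (toy example)
scale = np.abs(W).max() / 1.5          # symmetric 2-bit grid: {-1.5, -0.5, 0.5, 1.5} * scale
Wq = (np.clip(np.round(W / scale - 0.5), -2, 1) + 0.5) * scale  # crude 2-bit quantizer

# Rank-r adapter fitted to the weight error via truncated SVD (a simplification;
# RILQ optimizes adapters against a model-wise activation discrepancy loss instead).
U, S, Vt = np.linalg.svd(W - Wq)
A = U[:, :r] * np.sqrt(S[:r])          # d x r
B = np.sqrt(S[:r])[:, None] * Vt[:r]   # r x d

X = rng.normal(size=(64, d))           # calibration activations

# Activation discrepancy vs. the full-precision layer, before and after compensation.
before = np.mean((X @ W - X @ Wq) ** 2)
after = np.mean((X @ W - X @ (Wq + A @ B)) ** 2)
print(f"discrepancy before: {before:.4f}  after: {after:.4f}")
```

Because A @ B is a rank-r matrix, it can be merged into the quantized weight's dequantized form or kept as a standard LoRA adapter, which is why LQEC adds no inference-time overhead beyond ordinary LoRA.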

Cite

Text

Lee et al. "RILQ: Rank-Insensitive LoRA-Based Quantization Error Compensation for Boosting 2-Bit Large Language Model Accuracy." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I17.33990

Markdown

[Lee et al. "RILQ: Rank-Insensitive LoRA-Based Quantization Error Compensation for Boosting 2-Bit Large Language Model Accuracy." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/lee2025aaai-rilq/) doi:10.1609/AAAI.V39I17.33990

BibTeX

@inproceedings{lee2025aaai-rilq,
  title     = {{RILQ: Rank-Insensitive LoRA-Based Quantization Error Compensation for Boosting 2-Bit Large Language Model Accuracy}},
  author    = {Lee, Geonho and Lee, Janghwan and Hong, Sukjin and Kim, Minsoo and Ahn, Euijai and Chang, Du-Seong and Choi, Jungwook},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {18091--18100},
  doi       = {10.1609/AAAI.V39I17.33990},
  url       = {https://mlanthology.org/aaai/2025/lee2025aaai-rilq/}
}