Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking

Abstract

Training and fine-tuning Large Language Models (LLMs) is often highly resource- and time-intensive due to their large model sizes. To address this issue and improve accessibility, several memory-efficient techniques have been developed, such as Low-Rank Adaptation (LoRA), which optimizes the weights in a low-rank subspace, and Gradient Low-Rank Projection (GaLore), which projects gradients onto a lower-dimensional space. In this paper, we introduce Gradient Subspace Tracking (SubTrack), a method that restricts the optimization process to a small core subspace of gradient matrices while dynamically tracking subspace changes. By leveraging estimation errors and previously detected subspaces, SubTrack adjusts the subspace estimation using a computationally efficient approach. Despite applying only rank-1 updates, SubTrack achieves performance comparable to, or better than, GaLore while reducing runtime by up to 20.56%.
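Illustrative Sketch

The sketch below is a minimal, illustrative reading of the idea described in the abstract: keep an orthonormal basis U for a rank-r subspace of the gradient, run the optimizer step in that r-dimensional space, and refresh the basis with cheap rank-1 corrections instead of recomputing a full SVD each time. The specific tracking rule, the helper names (init_basis, rank1_subspace_update), and the plain-SGD toy objective are assumptions made for illustration only; this is not the authors' implementation of SubTrack.

import numpy as np

rng = np.random.default_rng(0)

def init_basis(grad, r):
    # Orthonormal basis spanning the top-r left singular directions of the gradient.
    U, _, _ = np.linalg.svd(grad, full_matrices=False)
    return U[:, :r]                                   # shape (m, r)

def rank1_subspace_update(U, grad, step=0.1):
    # Hypothetical rank-1 tracking rule (an assumption, not the paper's exact update):
    # rotate the basis toward the dominant direction of the projection residual,
    # then re-orthonormalize with QR.
    residual = grad - U @ (U.T @ grad)                # part of grad outside span(U)
    x = residual @ rng.standard_normal(residual.shape[1])
    u_new = residual @ (residual.T @ x)               # one power-iteration step
    norm = np.linalg.norm(u_new)
    if norm < 1e-12:
        return U                                      # gradient already lies in span(U)
    u_new /= norm
    U = U + step * np.outer(u_new, u_new @ U)         # rank-1 correction of the basis
    Q, _ = np.linalg.qr(U)                            # keep the basis orthonormal
    return Q

# Toy usage: projected SGD on a single weight matrix.
m, n, r, lr = 64, 32, 4, 1e-2
W = rng.standard_normal((m, n))
target = rng.standard_normal((m, n))

U = None
for t in range(200):
    grad = W - target                                 # gradient of 0.5 * ||W - target||^2
    U = init_basis(grad, r) if U is None else rank1_subspace_update(U, grad)
    low_rank_grad = U.T @ grad                        # optimizer state is only (r, n)
    W -= lr * (U @ low_rank_grad)                     # map back to full space and step

print("final loss:", 0.5 * np.linalg.norm(W - target) ** 2)

In this sketch the memory saving comes from keeping optimizer state for the projected (r, n) gradient rather than the full (m, n) matrix, and the rank-1 basis correction stands in for the inexpensive subspace tracking that replaces repeated full decompositions.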

Cite

Text

Rajabi and Rambhatla. "Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking." NeurIPS 2024 Workshops: OPT, 2024.

Markdown

[Rajabi and Rambhatla. "Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking." NeurIPS 2024 Workshops: OPT, 2024.](https://mlanthology.org/neuripsw/2024/rajabi2024neuripsw-memoryefficient/)

BibTeX

@inproceedings{rajabi2024neuripsw-memoryefficient,
  title     = {{Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking}},
  author    = {Rajabi, Sahar and Rambhatla, Sirisha},
  booktitle = {NeurIPS 2024 Workshops: OPT},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/rajabi2024neuripsw-memoryefficient/}
}