BackSlash: Rate Constrained Optimized Training of Large Language Models

Abstract

The rapid advancement of large language models (LLMs) has driven extensive research into parameter compression after training has been completed, yet compression during the training phase remains largely unexplored. In this work, we introduce Rate-Constrained Training (BackSlash), a novel training-time compression approach based on rate-distortion optimization (RDO). BackSlash enables a flexible trade-off between model accuracy and complexity, significantly reducing parameter redundancy while preserving performance. Experiments across various architectures and tasks demonstrate that BackSlash can reduce memory usage by 60%-90% without accuracy loss and provide significant compression gains compared to post-training compression. Moreover, BackSlash proves highly versatile: it enhances generalization with small Lagrange multipliers, improves model robustness to pruning (maintaining accuracy even at 80% pruning rates), and enables network simplification for accelerated inference on edge devices.
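To make the rate-distortion framing concrete, below is a minimal PyTorch-style sketch of a rate-constrained training step: the task loss plays the role of distortion, a surrogate penalty on the parameters plays the role of rate, and a Lagrange multiplier lam trades them off. The rate_penalty surrogate, the default lam value, and the function names are illustrative assumptions; the paper's actual rate estimator and training procedure are not reproduced here.

import torch

def rate_penalty(model):
    # Hypothetical surrogate for the parameter "rate": a log-magnitude sum
    # that loosely tracks the bits needed to encode the weights.
    return sum(torch.log1p(p.abs()).sum() for p in model.parameters())

def rdo_training_step(model, batch, task_loss_fn, optimizer, lam=1e-6):
    # One gradient step on the RDO-style objective D + lam * R.
    optimizer.zero_grad()
    inputs, targets = batch
    distortion = task_loss_fn(model(inputs), targets)  # task loss ("distortion")
    rate = rate_penalty(model)                         # model-size proxy ("rate")
    loss = distortion + lam * rate
    loss.backward()
    optimizer.step()
    return loss.item()

In this sketch, a larger lam pushes training toward smaller, more compressible parameters, while a smaller lam prioritizes task accuracy, mirroring the accuracy-complexity trade-off described in the abstract.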

Cite

Text

Wu et al. "BackSlash: Rate Constrained Optimized Training of Large Language Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Wu et al. "BackSlash: Rate Constrained Optimized Training of Large Language Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/wu2025icml-backslash/)

BibTeX

@inproceedings{wu2025icml-backslash,
  title     = {{BackSlash: Rate Constrained Optimized Training of Large Language Models}},
  author    = {Wu, Jun and Wen, Jiangtao and Han, Yuxing},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {67852-67863},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/wu2025icml-backslash/}
}