L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression

Abstract

Learning-based probabilistic models can be combined with an entropy coder for data compression. However, due to the high complexity of learning-based models, their practical application as text compressors has been largely overlooked. To address this issue, our work focuses on a low-complexity design that maintains compression performance. We introduce a novel Learned Lossless Low-complexity Text Compression method (L3TC). First, we conduct extensive experiments demonstrating that RWKV models achieve the fastest decoding speed with a moderate compression ratio, making them the most suitable backbone for our method. Second, we propose an outlier-aware tokenizer that uses a limited vocabulary to cover frequent tokens while allowing outliers to bypass the prediction and encoding. Third, we propose a novel high-rank reparameterization strategy that enhances the learning capability during training without increasing complexity during inference. Experimental results validate that our method achieves a 48% bit saving compared to the gzip compressor. In addition, L3TC offers compression performance comparable to other learned compressors, with a 50x reduction in model parameters. More importantly, L3TC is the fastest among all learned compressors, providing real-time decoding speeds of up to megabytes per second.
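The abstract's opening sentence, that a probabilistic model can be paired with an entropy coder, can be illustrated with a minimal sketch. It rests on a standard fact: an ideal entropy coder spends about -log2 p bits on a symbol the model predicted with probability p. The model below is an adaptive byte-frequency counter, a deliberately simple stand-in and not the RWKV backbone or tokenizer of L3TC. The function name `ideal_code_length_bits`, the sample text, and the saving measured against raw 8-bit bytes (rather than against gzip, as in the abstract) are all assumptions made for illustration.

```python
import math
from collections import Counter


def ideal_code_length_bits(symbols, alphabet_size=256):
    """Bits an ideal entropy coder would spend under an adaptive order-0 model.

    Hypothetical helper for illustration: the predictor is a Laplace-smoothed
    frequency count, standing in for a learned model. It is NOT the paper's model.
    """
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for s in symbols:
        # Probability assigned to s *before* observing it (add-one smoothing),
        # so a decoder running the same model could reproduce it.
        p = (counts[s] + 1) / (seen + alphabet_size)
        total_bits += -math.log2(p)
        # Update the model after coding the symbol.
        counts[s] += 1
        seen += 1
    return total_bits


# Illustrative input only; not data from the paper.
text = b"abracadabra " * 50
bits = ideal_code_length_bits(text)
raw_bits = 8 * len(text)
saving = 1 - bits / raw_bits  # fraction of bits saved relative to raw bytes
print(f"raw: {raw_bits} bits, modeled: {bits:.0f} bits, saving: {saving:.1%}")
```

A stronger predictor assigns higher probability to the symbols that actually occur, which lowers the total; that is the role a learned model plays in a compressor of this kind, at the price of running the model once per symbol during both encoding and decoding.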

Cite

Text

Zhang et al. "L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I12.33446

Markdown

[Zhang et al. "L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/zhang2025aaai-l/) doi:10.1609/AAAI.V39I12.33446

BibTeX

@inproceedings{zhang2025aaai-l,
  title     = {{L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression}},
  author    = {Zhang, Junxuan and Cheng, Zhengxue and Zhao, Yan and Wang, Shihao and Zhou, Dajiang and Lu, Guo and Song, Li},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {13251--13259},
  doi       = {10.1609/AAAI.V39I12.33446},
  url       = {https://mlanthology.org/aaai/2025/zhang2025aaai-l/}
}