TicketLLM: Next-Generation Sparse and Low-Bit Transformers with Supermask-Based Method

Abstract

Strong Lottery Tickets (SLTs) are subnetworks within a randomly weighted network, uncovered by a binary mask called a supermask. They offer a promising approach to model compression because they eliminate the need to store weights: the effective subnetwork can be regenerated from a fixed random seed and the supermask. However, extending this approach to large language models (LLMs) is non-trivial due to the limited scalability and inefficient training dynamics of existing SLT methods. To address these challenges, we propose Adaptive Supermask (Ada-Sup), a scalable and efficient method for discovering high-quality multi-bit supermasks through a quantization-based approach. Building on this method, we introduce TicketLLM, a low-bit and sparse Transformer-based LLM architecture powered by Ada-Sup. Experimental results show that Ada-Sup can discover high-quality supermasks with significantly reduced training costs compared to previous methods in both binary and multi-bit settings. Furthermore, TicketLLM outperforms BitNet b1.58 on a 1.3B-parameter model with the same memory per connection, achieving a 0.6% reduction in perplexity (from 13.62 to 13.54) while operating at a higher sparsity level (around 50% vs. around 33%). These results highlight the potential of supermask-based methods for building lightweight LLMs.
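
The idea that an SLT's weights need not be stored can be illustrated with a minimal sketch. It is not the Ada-Sup method: the function names, the Gaussian initialization, the layer shape, and the randomly drawn placeholder mask are all assumptions made for illustration, whereas in the paper the supermask is learned. The sketch only shows that a seed plus a binary mask is enough to rebuild the same effective weights.

```python
import numpy as np


def regenerate_weights(seed: int, shape: tuple) -> np.ndarray:
    """Rebuild the frozen random weights from a seed (hypothetical helper).

    The Gaussian distribution is an assumption for illustration only.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape).astype(np.float32)


def effective_weights(seed: int, supermask: np.ndarray) -> np.ndarray:
    """Apply a binary supermask to the regenerated random weights."""
    return regenerate_weights(seed, supermask.shape) * supermask


SEED = 0          # the only weight-related state that must be stored
shape = (4, 8)    # hypothetical layer shape

# Placeholder mask at roughly 50% sparsity; a real supermask is learned,
# not sampled at random.
mask_rng = np.random.default_rng(123)
supermask = (mask_rng.random(shape) < 0.5).astype(np.uint8)

# Two independent reconstructions from the same seed and mask agree exactly.
w_a = effective_weights(SEED, supermask)
w_b = effective_weights(SEED, supermask)

# Fraction of connections removed by the mask.
sparsity = 1.0 - supermask.mean()
```

Under these assumptions, persisting a layer requires only the seed and one bit per connection for the mask, rather than a full-precision weight per connection.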

Cite

Text

Okoshi et al. "TicketLLM: Next-Generation Sparse and Low-Bit Transformers with Supermask-Based Method." Transactions on Machine Learning Research, 2025.

Markdown

[Okoshi et al. "TicketLLM: Next-Generation Sparse and Low-Bit Transformers with Supermask-Based Method." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/okoshi2025tmlr-ticketllm/)

BibTeX

@article{okoshi2025tmlr-ticketllm,
  title     = {{TicketLLM: Next-Generation Sparse and Low-Bit Transformers with Supermask-Based Method}},
  author    = {Okoshi, Yasuyuki and Otsuka, Hikari and Fujiki, Daichi and Motomura, Masato},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/okoshi2025tmlr-ticketllm/}
}