Streamlining Language Models via Semantic Basis Analysis
Abstract
As the size of language models increases, they deliver substantial performance improvements across a variety of applications. However, this growth also leads to greater computational demands, making deployment on resource-constrained devices—such as personal computers and mobile or wearable devices—more challenging, and significantly raising inference costs on cloud servers. To address these challenges, we introduce Basel, a method to streamline language models by leveraging the semantic structure of their weight matrices. Specifically, Basel treats each weight matrix as a linear combination of bases, selectively retaining those that are associated with essential semantics for the target application, pruning redundant ones, and introducing new bases that enhance task performance. Experimental results demonstrate that Basel achieves significant model size reduction compared to baseline techniques, while maintaining comparable or even superior accuracy across diverse applications.
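The basis view described in the abstract can be sketched in a few lines. This is a hypothetical illustration, assuming the bases come from an SVD of the weight matrix; Basel's actual, semantics-aware criteria for selecting which bases to retain, prune, or add are described in the paper and are not reproduced here.

```python
import numpy as np

def prune_bases(W: np.ndarray, keep: int) -> np.ndarray:
    """Approximate W as a linear combination of its `keep` leading bases.

    Hypothetical sketch: ranks bases by singular value (energy), whereas
    Basel selects bases tied to task-essential semantics.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Each left-singular vector, scaled by its singular value and paired
    # with a right-singular vector, acts as one basis component of W;
    # dropping the trailing components shrinks the stored parameters.
    return U[:, :keep] @ np.diag(s[:keep]) @ Vt[:keep, :]

# Toy example: compress a 64x64 weight matrix down to 16 bases.
W = np.random.randn(64, 64)
W_small = prune_bases(W, keep=16)
```

Storing the truncated factors rather than the reconstructed matrix is what yields the size reduction: `keep * (m + n + 1)` numbers instead of `m * n`.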
Cite
Text
Li et al. "Streamlining Language Models via Semantic Basis Analysis." Transactions on Machine Learning Research, 2025.
Markdown
[Li et al. "Streamlining Language Models via Semantic Basis Analysis." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/li2025tmlr-streamlining/)
BibTeX
@article{li2025tmlr-streamlining,
title = {{Streamlining Language Models via Semantic Basis Analysis}},
author = {Li, Yang and Asante, Daniel Agyei and Zhao, Changsheng and Chang, Ernie and Shi, Yangyang and Chandra, Vikas},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/li2025tmlr-streamlining/}
}