Sample-Aware Adaptive Structured Pruning for Large Language Models

Abstract

Large language models (LLMs) have achieved outstanding performance in natural language processing, but their enormous model sizes and high computational costs limit practical deployment. Structured pruning can effectively reduce the resource demands of deployment by removing redundant model parameters. However, existing structured pruning methods rely on randomly selected calibration data and a single fixed importance estimation metric, which degrades the performance of the pruned models. This study introduces AdaPruner, a sample-aware adaptive structured pruning framework for LLMs that optimizes both the calibration data and the importance estimation metric used during structured pruning. Specifically, AdaPruner removes redundant parameters from LLMs by constructing a structured pruning solution space and then employing Bayesian optimization to adaptively search for the optimal calibration data and importance estimation metric. Experimental results show that AdaPruner outperforms existing structured pruning methods on a family of LLMs across varying pruning ratios, demonstrating its applicability and robustness. Remarkably, at a 20% pruning ratio, the model pruned with AdaPruner retains 97% of the unpruned model's performance.
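The search loop the abstract describes can be sketched as follows. This is a minimal, assumption-heavy illustration: the importance metrics, the proxy score, and element-level (rather than truly structured) pruning are all stand-ins, and plain random search replaces the Bayesian optimization that AdaPruner actually uses. None of the names below come from the paper.

```python
import random

# Hypothetical importance metrics (illustrative stand-ins; the paper's
# actual candidate metrics are not reproduced here).
def magnitude_importance(weights):
    return [abs(w) for w in weights]

def squared_importance(weights):
    return [w * w for w in weights]

METRICS = {"magnitude": magnitude_importance, "squared": squared_importance}

def prune(weights, importance, ratio):
    """Zero out the lowest-importance fraction of weights.

    Structured pruning (removing whole channels/heads) is simplified
    here to element-level pruning for illustration.
    """
    k = int(len(weights) * ratio)
    order = sorted(range(len(weights)), key=lambda i: importance[i])
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def proxy_score(pruned, calib_sample):
    # Toy evaluation: negative squared error against a calibration
    # sample; a real system would measure model quality instead.
    return -sum((w - x) ** 2 for w, x in zip(pruned, calib_sample))

def search(weights, calib_pool, ratio, trials=50, seed=0):
    """Random search over (calibration sample, metric) pairs -- a
    simple stand-in for Bayesian optimization over the solution space."""
    rng = random.Random(seed)
    best_score, best_metric, best_sample = float("-inf"), None, None
    for _ in range(trials):
        sample = rng.choice(calib_pool)
        name, metric = rng.choice(sorted(METRICS.items()))
        pruned = prune(weights, metric(weights), ratio)
        score = proxy_score(pruned, sample)
        if score > best_score:
            best_score, best_metric, best_sample = score, name, sample
    return best_score, best_metric, best_sample
```

In AdaPruner itself, each trial in this loop would correspond to pruning with a candidate (calibration set, metric) configuration and scoring the result, with a Bayesian surrogate model proposing the next configuration instead of sampling uniformly.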

Cite

Text

Kong et al. "Sample-Aware Adaptive Structured Pruning for Large Language Models." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I17.33973

Markdown

[Kong et al. "Sample-Aware Adaptive Structured Pruning for Large Language Models." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/kong2025aaai-sample/) doi:10.1609/AAAI.V39I17.33973

BibTeX

@inproceedings{kong2025aaai-sample,
  title     = {{Sample-Aware Adaptive Structured Pruning for Large Language Models}},
  author    = {Kong, Jun and Ma, Xinge and Wang, Jin and Zhang, Xuejie},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {17938--17946},
  doi       = {10.1609/AAAI.V39I17.33973},
  url       = {https://mlanthology.org/aaai/2025/kong2025aaai-sample/}
}