Context-Aware Dynamic Pruning for Speech Foundation Models

Abstract

Foundation models, such as large language models, have achieved remarkable success in natural language processing and are evolving into models capable of handling multiple modalities. Listening ability, in particular, is crucial for many applications, motivating research on building speech foundation models. However, the high computational cost of these large models poses a significant challenge for real-world deployment. Although substantial effort has gone into reducing computational costs, for example through pruning, most of these approaches apply pruning during training for specific downstream tasks. In this study, we hypothesize that the optimal pruned network may vary with contextual factors such as speaker characteristics, language, and task. To exploit this, we propose a dynamic pruning technique that adapts to these contexts during inference without altering the underlying model. We demonstrate that our method reduces inference time by approximately 30% while maintaining accuracy in multilingual/multi-task scenarios. We also find that the resulting pruned structures offer meaningful context-based interpretations, e.g., task-related information emerges as the dominant factor for efficient pruning.
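
To make the core idea concrete, the sketch below is a minimal, hypothetical illustration of context-aware dynamic pruning, not the authors' implementation: a small gate predictor maps a context embedding (standing in for speaker, language, or task information) to per-layer keep probabilities, and layers whose gate falls below a threshold are skipped at inference while the underlying weights stay untouched. All class names, the whole-layer granularity, and the thresholding rule are illustrative assumptions; the paper may gate finer-grained modules and train the gates jointly with the model.

# Minimal sketch of context-aware dynamic pruning (illustrative assumptions,
# not the paper's implementation).
import torch
import torch.nn as nn


class GatePredictor(nn.Module):
    """Predicts one keep-probability per encoder layer from a context vector."""

    def __init__(self, context_dim: int, num_layers: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(context_dim, context_dim),
            nn.ReLU(),
            nn.Linear(context_dim, num_layers),
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # Sigmoid keeps gates in (0, 1); training hard gates end to end would
        # need a straight-through or Gumbel-softmax estimator.
        return torch.sigmoid(self.net(context))


class DynamicallyPrunedEncoder(nn.Module):
    """Wraps a layer stack and skips layers the gate predictor turns off.
    The underlying layer weights are never modified."""

    def __init__(self, layers: nn.ModuleList, context_dim: int, threshold: float = 0.5):
        super().__init__()
        self.layers = layers
        self.gates = GatePredictor(context_dim, len(layers))
        self.threshold = threshold

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        g = self.gates(context)  # shape: (num_layers,)
        for gate, layer in zip(g, self.layers):
            if gate >= self.threshold:  # keep this layer for this context
                x = layer(x)
            # else: skip the layer entirely, saving its compute
        return x


if __name__ == "__main__":
    dim, num_layers = 256, 12
    layers = nn.ModuleList(
        nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        for _ in range(num_layers)
    )
    encoder = DynamicallyPrunedEncoder(layers, context_dim=dim)
    speech = torch.randn(1, 100, dim)  # (batch, frames, features)
    context = torch.randn(dim)         # stand-in for a learned context embedding
    print(encoder(speech, context).shape)  # torch.Size([1, 100, 256])

Under this reading, the pruned structure is simply the set of layers kept for a given context, which is what makes the context-based interpretation (e.g., task information dominating the pruning decisions) possible.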

Cite

Text

Someki et al. "Context-Aware Dynamic Pruning for Speech Foundation Models." International Conference on Learning Representations, 2025.

Markdown

[Someki et al. "Context-Aware Dynamic Pruning for Speech Foundation Models." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/someki2025iclr-contextaware/)

BibTeX

@inproceedings{someki2025iclr-contextaware,
  title     = {{Context-Aware Dynamic Pruning for Speech Foundation Models}},
  author    = {Someki, Masao and Peng, Yifan and Arora, Siddhant and Müller, Markus and Mouchtaris, Athanasios and Strimel, Grant and Liu, Jing and Watanabe, Shinji},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/someki2025iclr-contextaware/}
}