StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs

Abstract

Prevalent semantic speech tokenizers, designed to capture linguistic content, are surprisingly fragile. We find they are not robust to meaning-irrelevant acoustic perturbations; even at high Signal-to-Noise Ratios (SNRs) where speech is perfectly intelligible, their output token sequences can change drastically, increasing the learning burden for downstream LLMs. This instability stems from two flaws: a brittle single-path quantization architecture and a distant training signal indifferent to intermediate token stability. To address these flaws, we introduce StableToken, a tokenizer that achieves stability through a consensus-driven mechanism. Its multi-branch architecture processes audio in parallel, and the resulting representations are merged via a bit-wise voting mechanism to form a single, stable token sequence. StableToken sets a new state of the art in token stability, drastically reducing Unit Edit Distance (UED) under diverse noise conditions. This foundational stability translates directly to downstream benefits, significantly improving the robustness of SpeechLLMs on a variety of tasks. Our code and model are publicly available at https://github.com/Tencent/StableToken.
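
The two ideas named in the abstract, bit-wise voting across branches and Unit Edit Distance, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each branch emits one integer token per frame whose binary digits are the quantized bits, that the number of branches is odd, and that UED is the Levenshtein distance between the clean and perturbed token sequences divided by the clean sequence length. The function names `bitwise_majority_vote`, `edit_distance`, and `unit_edit_distance` are hypothetical.

```python
from typing import Sequence


def bitwise_majority_vote(branch_tokens: Sequence[int], num_bits: int) -> int:
    """Merge one token per branch into a single token by per-bit majority.

    Assumption: every token is an integer whose lowest `num_bits` bits are
    the quantized code of that branch for the same audio frame.
    """
    if len(branch_tokens) % 2 == 0:
        raise ValueError("use an odd number of branches so no bit can tie")
    merged = 0
    for bit in range(num_bits):
        # Count how many branches set this bit.
        ones = sum((token >> bit) & 1 for token in branch_tokens)
        if 2 * ones > len(branch_tokens):
            merged |= 1 << bit
    return merged


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two token sequences (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def unit_edit_distance(clean: Sequence[int], noisy: Sequence[int]) -> float:
    """Assumed UED: edit distance normalized by the clean sequence length."""
    if len(clean) == 0:
        raise ValueError("clean token sequence must be non-empty")
    return edit_distance(clean, noisy) / len(clean)


if __name__ == "__main__":
    # Three hypothetical branches disagree on bits 1 and 2; the vote settles them.
    print(bin(bitwise_majority_vote([0b1011, 0b1001, 0b1111], num_bits=4)))
    # One substituted token out of four.
    print(unit_edit_distance([5, 7, 7, 2], [5, 7, 3, 2]))
```

In this sketch a single branch flipping a bit under noise does not change the merged token as long as the other branches agree, which is the intuition behind a consensus-driven design. A lower UED means the tokenizer's output changed less between clean and perturbed audio.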

Cite

Text

Song et al. "StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs." International Conference on Learning Representations, 2026.

Markdown

[Song et al. "StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/song2026iclr-stabletoken/)

BibTeX

@inproceedings{song2026iclr-stabletoken,
  title     = {{StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs}},
  author    = {Song, Yuhan and Zhang, Linhao and Wu, Chuhan and Liu, Aiwei and Jia, Wei and Wang, Houfeng and Xiao, Zhou},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/song2026iclr-stabletoken/}
}