Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention

Abstract

State-space models (SSMs) have recently emerged as a compelling alternative to Transformers for sequence modeling tasks. This paper presents a theoretical generalization analysis of selective SSMs, the core architectural component behind the Mamba model. We derive a novel covering-number-based generalization bound for selective SSMs, building on recent theoretical advances in the analysis of Transformer models. Using this result, we analyze how the spectral abscissa of the continuous-time state matrix influences the model's stability during training and its ability to generalize across sequence lengths. We empirically validate our findings on a synthetic majority task, the IMDb sentiment classification benchmark, and the ListOps task, demonstrating how our theoretical insights translate into practical model behavior.
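For readers unfamiliar with the stability notion the abstract invokes: the spectral abscissa of a square matrix A is the largest real part among its eigenvalues, and a negative spectral abscissa guarantees that the continuous-time dynamics dx/dt = A x decay exponentially. The minimal sketch below (illustrative only, not code from the paper) computes this quantity with NumPy:

import numpy as np

def spectral_abscissa(A: np.ndarray) -> float:
    """Largest real part among the eigenvalues of a square matrix A."""
    return float(np.max(np.real(np.linalg.eigvals(A))))

# An upper-triangular state matrix with eigenvalues -1 and -0.5:
# the spectral abscissa is -0.5 < 0, so dx/dt = A x is exponentially stable.
A_stable = np.array([[-1.0, 2.0],
                     [0.0, -0.5]])
print(spectral_abscissa(A_stable))  # -0.5

# Shifting the spectrum right by 1 moves an eigenvalue into the right
# half-plane (eigenvalues 0 and 0.5), making the dynamics unstable.
A_unstable = A_stable + np.eye(2)
print(spectral_abscissa(A_unstable))  # 0.5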

Cite

Text

Honarpisheh et al. "Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention." Advances in Neural Information Processing Systems, 2025.

Markdown

[Honarpisheh et al. "Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/honarpisheh2025neurips-generalization/)

BibTeX

@inproceedings{honarpisheh2025neurips-generalization,
  title     = {{Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention}},
  author    = {Honarpisheh, Arya and Bozdag, Mustafa and Camps, Octavia and Sznaier, Mario},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/honarpisheh2025neurips-generalization/}
}