Mamba State-Space Models Are Lyapunov-Stable Learners

Abstract

Compute-efficient methods, e.g., mixed-precision fine-tuning (MPFT) and parameter-efficient fine-tuning (PEFT), have become standard tools for Transformer-based large language models (LLMs). While these methods are near-ubiquitously adopted, we empirically show that, under different combinations of MPFT and PEFT, Transformer LLMs may drastically diverge from their respective full-precision counterparts. In stark contrast, we show that recent Mamba LLMs based on state-space models (SSMs) are significantly more robust to the changes introduced by combinations of MPFT and PEFT. This robustness is due to the recurrent dynamics of Mamba SSMs, which we prove are guaranteed to be stable using dynamical systems theory (in particular, Lyapunov exponents). Additionally, we demonstrate how targeting different Mamba parameters for low-rank adaptation provides regularization and impacts PEFT generalization. We conclude by using MPFT and PEFT to study, for the first time, Mamba LLMs' in-context learning (ICL) abilities on natural language tasks, thus supplementing other recent work.
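
The stability argument above rests on Lyapunov exponents of the SSM recurrence. As a rough illustration only (not the authors' code or proof), the sketch below estimates the largest Lyapunov exponent of a diagonal linear recurrence of the form h_t = a_t * h_{t-1} + b_t * x_t, the shape taken by a discretized Mamba-style SSM channel; the decay factors are hypothetical placeholders chosen simply to lie in (0, 1), under which assumption the exponent comes out negative, i.e., stable.

```python
import numpy as np

# Minimal sketch: largest Lyapunov exponent of a diagonal linear recurrence
# h_t = a_t * h_{t-1} + b_t * x_t. For a diagonal system, the exponent per
# channel is the time-averaged log of the recurrent multipliers a_t.

rng = np.random.default_rng(0)
T, d = 10_000, 16  # sequence length, state dimension (illustrative values)

# Hypothetical input-dependent decay factors in (0, 1), standing in for the
# discretized SSM transition terms; values here are illustrative only.
a = np.exp(-rng.uniform(0.01, 1.0, size=(T, d)))

lyapunov = np.log(a).mean(axis=0)                     # per-channel exponents
print("largest Lyapunov exponent:", lyapunov.max())   # negative => stable
```

Under this assumption (all multipliers strictly inside the unit interval), every channel's exponent is negative, which is the dynamical-systems sense of stability the abstract refers to.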

Cite

Text

Halloran et al. "Mamba State-Space Models Are Lyapunov-Stable Learners." ICLR 2025 Workshops: DeLTa, 2025.

Markdown

[Halloran et al. "Mamba State-Space Models Are Lyapunov-Stable Learners." ICLR 2025 Workshops: DeLTa, 2025.](https://mlanthology.org/iclrw/2025/halloran2025iclrw-mamba/)

BibTeX

@inproceedings{halloran2025iclrw-mamba,
  title     = {{Mamba State-Space Models Are Lyapunov-Stable Learners}},
  author    = {Halloran, John Timothy and Gulati, Manbir S and Roysdon, Paul F},
  booktitle = {ICLR 2025 Workshops: DeLTa},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/halloran2025iclrw-mamba/}
}