Fixed-Point RNNs: Interpolating from Diagonal to Dense

Abstract

Linear recurrent neural networks (RNNs) and state-space models (SSMs) such as Mamba have become promising alternatives to softmax attention as sequence-mixing layers in Transformer architectures. Current models, however, do not exhibit the full state-tracking expressivity of RNNs because they rely on channel-wise (i.e., diagonal) sequence mixing. In this paper, we investigate parameterizations of a large class of dense linear RNNs as fixed points of parallelizable diagonal linear RNNs. The resulting models can naturally trade expressivity for efficiency at a fixed number of parameters and achieve state-of-the-art results on the state-tracking benchmarks $A_5$ and $S_5$, while matching performance on copying and other tasks.
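To make the fixed-point idea concrete, the following is a minimal NumPy sketch, not the paper's parameterization. It assumes a simple Jacobi-style splitting of a dense transition matrix `A` into its diagonal `d` and off-diagonal remainder `R`, and repeatedly runs a diagonal linear RNN whose input is corrected by `R` applied to the previous iterate. All names (`dense_rnn`, `diagonal_rnn`, `fixed_point_rnn`, `num_iters`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def dense_rnn(A, X):
    """Reference: sequential dense linear RNN h_t = A h_{t-1} + x_t with h_0 = 0."""
    h = np.zeros(A.shape[0])
    out = []
    for x in X:
        h = A @ h + x
        out.append(h)
    return np.stack(out)

def diagonal_rnn(d, U):
    """Channel-wise linear RNN h_t = d * h_{t-1} + u_t with h_0 = 0.
    Written as a plain loop for clarity; because the recurrence is diagonal,
    an associative scan could compute it in parallel over time."""
    h = np.zeros_like(d)
    out = []
    for u in U:
        h = d * h + u
        out.append(h)
    return np.stack(out)

def fixed_point_rnn(A, X, num_iters):
    """Approximate the dense RNN by iterating a diagonal RNN (assumed splitting A = diag(d) + R)."""
    d = np.diag(A).copy()           # diagonal part, handled by the diagonal RNN
    R = A - np.diag(d)              # off-diagonal part, handled by the iteration
    H = np.zeros_like(X)            # initial guess for all hidden states
    for _ in range(num_iters):
        # Shift the previous iterate by one step so that row t holds h_{t-1}.
        H_prev = np.vstack([np.zeros((1, X.shape[1])), H[:-1]])
        # Diagonal RNN driven by the input plus the off-diagonal correction R h_{t-1}.
        H = diagonal_rnn(d, X + H_prev @ R.T)
    return H

rng = np.random.default_rng(0)
A = 0.3 * rng.standard_normal((4, 4))   # dense transition matrix (toy example)
X = rng.standard_normal((8, 4))         # sequence of 8 inputs with 4 channels

H_dense = dense_rnn(A, X)
H_fixed = fixed_point_rnn(A, X, num_iters=len(X))
print(np.allclose(H_fixed, H_dense))
```

In this toy setting, a single iteration reduces to the purely diagonal RNN, and each further iteration makes one more time step agree with the dense recurrence, so the iteration count acts as a knob between the diagonal and dense regimes without changing the number of parameters. How the paper parameterizes and solves the fixed point may differ from this splitting.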

Cite

Text

Movahedi et al. "Fixed-Point RNNs: Interpolating from Diagonal to Dense." Advances in Neural Information Processing Systems, 2025.

Markdown

[Movahedi et al. "Fixed-Point RNNs: Interpolating from Diagonal to Dense." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/movahedi2025neurips-fixedpoint/)

BibTeX

@inproceedings{movahedi2025neurips-fixedpoint,
  title     = {{Fixed-Point RNNs: Interpolating from Diagonal to Dense}},
  author    = {Movahedi, Sajad and Sarnthein, Felix and Cirone, Nicola Muca and Orvieto, Antonio},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/movahedi2025neurips-fixedpoint/}
}