Provable Length Generalization in Sequence Prediction via Spectral Filtering
Abstract
We consider the problem of length generalization in sequence prediction. We define a new metric of performance in this setting – the Asymmetric-Regret– which measures regret against a benchmark predictor with longer context length than available to the learner. We continue by studying this concept through the lens of the spectral filter-ing algorithm. We present a gradient-based learn-ing algorithm that provably achieves length generalization for linear dynamical systems. We conclude with proof-of-concept experiments which are consistent with our theory.
Cite
Text
Marsden et al. "Provable Length Generalization in Sequence Prediction via Spectral Filtering." Proceedings of the 42nd International Conference on Machine Learning, 2025.Markdown
[Marsden et al. "Provable Length Generalization in Sequence Prediction via Spectral Filtering." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/marsden2025icml-provable/)BibTeX
@inproceedings{marsden2025icml-provable,
title = {{Provable Length Generalization in Sequence Prediction via Spectral Filtering}},
author = {Marsden, Annie and Dogariu, Evan and Agarwal, Naman and Chen, Xinyi and Suo, Daniel and Hazan, Elad},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {43200-43224},
volume = {267},
url = {https://mlanthology.org/icml/2025/marsden2025icml-provable/}
}