Long-Context Linear System Identification

Abstract

This paper addresses the problem of long-context linear system identification, where the state $x_t$ of the system at time $t$ depends linearly on the previous states $x_s$ within a fixed context window of length $p$. For a broad class of systems, we establish a sample complexity bound that matches the _i.i.d._ parametric rate up to logarithmic factors, extending previous work that considered only first-order dependencies. Our findings reveal a "learning-without-mixing" phenomenon: learning long-context linear autoregressive models is not hindered by the slow mixing potentially associated with extended context windows. Additionally, we extend these results to _(i)_ shared low-rank feature representations, where rank-regularized estimators improve rates with respect to dimensionality, and _(ii)_ misspecified context lengths in strictly stable systems, where shorter contexts offer statistical advantages.
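To make the setting concrete, below is a minimal sketch (not from the paper) of an order-$p$ linear autoregressive model and its ordinary least-squares estimator, assuming the standard form $x_t = \sum_{s=1}^{p} A_s^\star x_{t-s} + w_t$ with Gaussian noise $w_t$; the dimensions, horizon, and the norm-based stability rescaling are illustrative choices, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, T = 5, 3, 2000  # state dimension, context length, horizon (illustrative)

# Ground-truth matrices A_1*, ..., A_p*, rescaled so the sum of operator
# norms is below 1 -- a simple sufficient condition for strict stability.
A_true = [rng.normal(size=(d, d)) for _ in range(p)]
scale = 0.9 / sum(np.linalg.norm(A, 2) for A in A_true)
A_true = [scale * A for A in A_true]

# Roll out one trajectory x_t = sum_s A_s* x_{t-s} + w_t with w_t ~ N(0, I).
x = np.zeros((T + p, d))
for t in range(p, T + p):
    x[t] = sum(A_true[s] @ x[t - 1 - s] for s in range(p)) + rng.normal(size=d)

# Ordinary least squares: regress x_t on the stacked context [x_{t-1}; ...; x_{t-p}].
Y = x[p:]                                                        # (T, d) targets
Z = np.hstack([x[p - 1 - s : T + p - 1 - s] for s in range(p)])  # (T, p*d) contexts
A_hat = np.linalg.lstsq(Z, Y, rcond=None)[0].T                   # (d, p*d) = [A_1 ... A_p]

# Per-lag recovery error in operator norm.
for s, (A_s_hat, A_s) in enumerate(zip(np.split(A_hat, p, axis=1), A_true), start=1):
    print(f"lag {s}: ||A_hat - A*||_2 = {np.linalg.norm(A_s_hat - A_s, 2):.4f}")
```

The point of the abstract's main result, in these terms, is that even though the rows of $Z$ are temporally dependent (not _i.i.d._), this estimator attains the parametric rate one would expect from $T$ independent context-target pairs, up to logarithmic factors.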

Cite

Text

Yüksel et al. "Long-Context Linear System Identification." International Conference on Learning Representations, 2025.

Markdown

[Yüksel et al. "Long-Context Linear System Identification." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/yuksel2025iclr-longcontext/)

BibTeX

@inproceedings{yuksel2025iclr-longcontext,
  title     = {{Long-Context Linear System Identification}},
  author    = {Yüksel, Oğuz Kaan and Even, Mathieu and Flammarion, Nicolas},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/yuksel2025iclr-longcontext/}
}