Towards a Theoretical Understanding of In-Context Learning: Stability and Non-I.I.D Generalisation

Abstract

In-context learning (ICL) has demonstrated significant performance improvements in transformer-based large models. This study identifies two key factors influencing ICL generalisation in complex non-i.i.d. scenarios: algorithmic stability and distributional discrepancy. First, we establish a stability bound for transformer-based models trained with mini-batch gradient descent, revealing how specific optimization configurations interact with the smoothness of the loss landscape to ensure the stability of non-linear transformers. Next, we introduce a distribution-level discrepancy measure that highlights the importance of aligning the ICL prompt distribution with the training data distribution to achieve effective generalisation. Building on these insights, we derive a generalisation error bound for ICL with asymptotic convergence guarantees, which further reveals that token-wise prediction errors accumulate over time and can even lead to generalisation collapse if the prediction length is not properly constrained. Finally, empirical evaluations are provided to validate our theoretical findings.

Cite

Text

Wang et al. "Towards a Theoretical Understanding of In-Context Learning: Stability and Non-I.I.D Generalisation." International Conference on Learning Representations, 2026.

Markdown

[Wang et al. "Towards a Theoretical Understanding of In-Context Learning: Stability and Non-I.I.D Generalisation." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/wang2026iclr-theoretical/)

BibTeX

@inproceedings{wang2026iclr-theoretical,
  title     = {{Towards a Theoretical Understanding of In-Context Learning: Stability and Non-I.I.D Generalisation}},
  author    = {Wang, Yingjie and Zhou, Yutian and Fu, Shi and Chen, Yuzhu and Jing, Yongcheng and Rutkowski, Leszek and Tao, Dacheng},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/wang2026iclr-theoretical/}
}