Comparing the Learning Dynamics of In-Context Learning and Fine-Tuning in Language Models

Abstract

Pretrained language models can acquire novel tasks either through in-context learning (ICL), which adapts behavior via activations without weight updates, or through supervised fine-tuning (SFT), where parameters are explicitly updated. Prior work has reported differences in their generalization performance and inductive biases, but the origins of these differences remain poorly understood. In this work, we treat ICL and SFT as distinct learning algorithms and directly compare the learning dynamics they induce across medium-sized models, analyzing both the evolution of their inductive biases and the underlying internal representations. We find that ICL preserves rich input representations but imposes stronger priors inherited from pretraining, whereas SFT suppresses task-irrelevant features, potentially explaining its weaker generalization in few-shot regimes. These results highlight a mechanistic distinction between context-driven and weight-driven learning.

Cite

Text

Confavreux et al. "Comparing the Learning Dynamics of In-Context Learning and Fine-Tuning in Language Models." International Conference on Learning Representations, 2026.

Markdown

[Confavreux et al. "Comparing the Learning Dynamics of In-Context Learning and Fine-Tuning in Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/confavreux2026iclr-comparing/)

BibTeX

@inproceedings{confavreux2026iclr-comparing,
  title     = {{Comparing the Learning Dynamics of In-Context Learning and Fine-Tuning in Language Models}},
  author    = {Confavreux, Basile and Singh, Aaditya K and Lee, Jin Hwa and Sabran, Amaury and Saxe, Andrew M},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/confavreux2026iclr-comparing/}
}