Tracing the Representation Geometry of Language Models from Pretraining to Post-Training
Abstract
Standard training metrics like loss fail to explain the emergence of complex capabilities in large language models. We take a spectral approach to investigate the geometry of learned representations across pretraining and post-training, measuring effective rank (RankMe) and eigenspectrum decay (αReQ). With OLMo (1B-7B) and Pythia (160M-12B) models, we uncover a consistent non-monotonic sequence of three geometric phases during autoregressive pretraining. The initial “warmup” phase exhibits rapid representational collapse. This is followed by an “entropy-seeking” phase, where the manifold’s dimensionality expands substantially, coinciding with peak n-gram memorization. Subsequently, a “compression-seeking” phase imposes anisotropic consolidation, selectively preserving variance along dominant eigendirections while contracting others, a transition marked by significant improvements in downstream task performance. We show these phases can emerge from a fundamental interplay between cross-entropy optimization under skewed token frequencies and representational bottlenecks (d ≪ |V|). Post-training further transforms geometry: SFT and DPO drive “entropy-seeking” dynamics to integrate specific instructional or preferential data, improving in-distribution performance while degrading out-of-distribution robustness. Conversely, RLVR induces “compression-seeking” dynamics, enhancing reward alignment but reducing generation diversity.
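For a concrete sense of the two metrics the abstract names, the sketch below shows one plausible way to compute them with NumPy from a matrix of token representations. This is not the authors' released code: the function names (`rankme`, `alpha_req`), the plain log-log least-squares fit for α, and the choice to fit over all positive covariance eigenvalues are illustrative assumptions; the paper's exact estimators (e.g., the index range used for the α fit) may differ.

```python
import numpy as np

def rankme(X: np.ndarray, eps: float = 1e-12) -> float:
    """Effective rank (RankMe): exponential of the Shannon entropy
    of the L1-normalized singular values of the representation matrix X."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / (s.sum() + eps)
    entropy = -np.sum(p * np.log(p + eps))
    return float(np.exp(entropy))

def alpha_req(X: np.ndarray) -> float:
    """Eigenspectrum decay exponent (alpha-ReQ): alpha from a log-log
    linear fit of the covariance eigenvalues, lambda_k ~ k^(-alpha)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # sort descending
    eigvals = eigvals[eigvals > 0]            # keep the positive spectrum
    ks = np.arange(1, len(eigvals) + 1)
    slope, _ = np.polyfit(np.log(ks), np.log(eigvals), 1)
    return float(-slope)

# Toy usage: n = 4096 token representations of dimension d = 256.
X = np.random.randn(4096, 256)
print(f"RankMe: {rankme(X):.1f}, alpha: {alpha_req(X):.2f}")
```

Under this reading, an “entropy-seeking” phase would show RankMe rising (flatter spectrum, smaller α), while a “compression-seeking” phase would show α increasing as variance concentrates in the leading eigendirections.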
Cite
Text
Li et al. "Tracing the Representation Geometry of Language Models from Pretraining to Post-Training." Advances in Neural Information Processing Systems, 2025.
Markdown
[Li et al. "Tracing the Representation Geometry of Language Models from Pretraining to Post-Training." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/li2025neurips-tracing/)
BibTeX
@inproceedings{li2025neurips-tracing,
  title     = {{Tracing the Representation Geometry of Language Models from Pretraining to Post-Training}},
  author    = {Li, Melody Zixuan and Agrawal, Kumar Krishna and Ghosh, Arna and Teru, Komal Kumar and Santoro, Adam and Lajoie, Guillaume and Richards, Blake Aaron},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/li2025neurips-tracing/}
}