Predicting Training Re-Evaluation Curves Enables Effective Data Curriculums for LLMs

Bergsma, Shane; Dey, Nolan Simran; Hestness, Joel

ICLR 2026

Abstract

Data curriculums have become central to successful LLM training, yet principles governing optimal data placement remain unclear. We introduce the *training re-evaluation curve (TREC)*, a diagnostic that retrospectively evaluates training batches *using the final model weights*. The TREC characterizes how well a trained model retains training data as a function of *when* the data was encountered during training. Analyzing TRECs for models from 111M to 3.9B parameters, we show that placing high-quality data at low points on the TREC significantly improves performance. Importantly, while a TREC is initially observable only after training, we demonstrate it can be *predicted in advance* from AdamW’s implicit EMA coefficients, enabling proactive curriculum design. By predicting TRECs for published training recipes, we explain prior ablations and reveal suboptimal data placements. We also align high-quality data with TREC minima in order to improve continual pre-training of a 3.9B-parameter LLM trained on 900B tokens.
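
As an illustration of the "implicit EMA coefficients" mentioned above, the following is a minimal sketch, not the authors' method. It relies on the standard observation that AdamW's decoupled weight-decay update, `theta_t = (1 - lr_t * wd) * theta_{t-1} - lr_t * u_t`, is an exponential moving average with smoothing `alpha_t = lr_t * wd`. Unrolling that recursion gives the weight each step's update carries in the final parameters. The abstract does not state how the TREC is derived from these coefficients, so the sketch stops at computing them. The function names, the cosine schedule, and all hyperparameter values are assumptions chosen for illustration.

```python
import math


def cosine_lr(step, total_steps, peak_lr, warmup_steps=0, min_lr=0.0):
    """Hypothetical schedule: linear warmup, then cosine decay to min_lr."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def ema_coefficients(lrs, weight_decay):
    """Return (coeffs, init_weight) for an AdamW run with per-step LRs `lrs`.

    coeffs[i] = alpha_i * prod_{j > i} (1 - alpha_j), with alpha_i = lrs[i] * weight_decay,
    is the weight of step i's update in the final parameters.
    init_weight = prod_j (1 - alpha_j) is the weight left on the initial parameters.
    """
    coeffs = [0.0] * len(lrs)
    tail = 1.0  # running product of (1 - alpha_j) over steps after i
    for i in range(len(lrs) - 1, -1, -1):
        alpha = lrs[i] * weight_decay
        coeffs[i] = alpha * tail
        tail *= 1.0 - alpha
    return coeffs, tail


# Example with assumed values: 1000 steps, peak LR 1e-3, weight decay 0.1.
lrs = [cosine_lr(t, 1000, 1e-3, warmup_steps=100) for t in range(1000)]
coeffs, init_weight = ema_coefficients(lrs, weight_decay=0.1)
peak_step = max(range(len(coeffs)), key=coeffs.__getitem__)
print(f"step with largest coefficient: {peak_step}")
```

The coefficients and the residual weight on the initial parameters always sum to one, which makes the schedule's effect easy to inspect: a constant learning rate weights the most recent steps most heavily, while a decaying schedule shifts weight toward earlier steps.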

Cite

Text

Bergsma et al. "Predicting Training Re-Evaluation Curves Enables Effective Data Curriculums for LLMs." International Conference on Learning Representations, 2026.

Markdown

[Bergsma et al. "Predicting Training Re-Evaluation Curves Enables Effective Data Curriculums for LLMs." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/bergsma2026iclr-predicting/)

BibTeX

@inproceedings{bergsma2026iclr-predicting,
  title     = {{Predicting Training Re-Evaluation Curves Enables Effective Data Curriculums for LLMs}},
  author    = {Bergsma, Shane and Dey, Nolan Simran and Hestness, Joel},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/bergsma2026iclr-predicting/}
}