ML Anthology
Soboleva, Daria
2 publications
NeurIPS 2025
Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-Training
Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness
ICLR 2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness