Dremov, Aleksandr

1 publication

TMLR 2025. Training Dynamics of the Cooldown Stage in Warmup-Stable-Decay Learning Rate Scheduler. Aleksandr Dremov, Alexander Hägele, Atli Kosson, Martin Jaggi