Dremov, Aleksandr

2 publications

ICLR 2026. Compute-Optimal Quantization-Aware Training. Aleksandr Dremov, David Grangier, Angelos Katharopoulos, Awni Hannun
TMLR 2025. Training Dynamics of the Cooldown Stage in Warmup-Stable-Decay Learning Rate Scheduler. Aleksandr Dremov, Alexander Hägele, Atli Kosson, Martin Jaggi