Hägele, Alexander

6 publications

TMLR 2025 Inverse Scaling in Test-Time Compute Aryo Pradipta Gema, Alexander Hägele, Runjin Chen, Andy Arditi, Jacob Goldman-Wetzler, Kit Fraser-Taliente, Henry Sleight, Linda Petrini, Julian Michael, Beatrice Alex, Pasquale Minervini, Yanda Chen, Joe Benton, Ethan Perez
ICML 2025 The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training Fabian Schaipp, Alexander Hägele, Adrien Taylor, Umut Simsekli, Francis Bach
TMLR 2025 Training Dynamics of the Cooldown Stage in Warmup-Stable-Decay Learning Rate Scheduler Aleksandr Dremov, Alexander Hägele, Atli Kosson, Martin Jaggi
NeurIPS 2024 Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben Allal, Leandro Von Werra, Martin Jaggi
ICMLW 2024 Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben Allal, Leandro Von Werra, Martin Jaggi
AISTATS 2023 BaCaDI: Bayesian Causal Discovery with Unknown Interventions Alexander Hägele, Jonas Rothfuss, Lars Lorch, Vignesh Ram Somnath, Bernhard Schölkopf, Andreas Krause