Balles, Lukas
16 publications
ICLR
2025
Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models
Pit Neitemeier, Björn Deiseroth, Constantin Eichenberg, Lukas Balles
ICLR
2025
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
TMLR
2024
On the Choice of Learning Rate for Local SGD
Lukas Balles, Prabhu Teja S, Cedric Archambeau
NeurIPSW
2024
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW
2024
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW
2024
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
NeurIPSW
2023
A Negative Result on Gradient Matching for Selective Backprop
Lukas Balles, Cedric Archambeau, Giovanni Zappella
NeurIPSW
2023
Continual Learning with Low Rank Adaptation
Martin Wistuba, Prabhu Teja S, Lukas Balles, Giovanni Zappella
ICLR
2023
PASHA: Efficient HPO and NAS with Progressive Resource Allocation
Ondrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cedric Archambeau, Giovanni Zappella
NeurIPSW
2021
Gradient-Matching Coresets for Continual Learning
Lukas Balles, Giovanni Zappella, Cedric Archambeau
NeurIPSW
2020
Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering
Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig
ICLR
2020
The Geometry of Sign Gradient Descent
Lukas Balles, Fabian Pedregosa, Nicolas Le Roux
ICLR
2019
DeepOBS: A Deep Learning Optimizer Benchmark Suite
Frank Schneider, Lukas Balles, Philipp Hennig
NeurIPS
2019
Limitations of the Empirical Fisher Approximation for Natural Gradient Descent
Frederik Kunstner, Philipp Hennig, Lukas Balles
ICML
2018
Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients
Lukas Balles, Philipp Hennig
UAI
2017
Coupling Adaptive Batch Sizes with Learning Rates
Lukas Balles, Javier Romero, Philipp Hennig