Balles, Lukas

16 publications

ICLR 2025 Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models Pit Neitemeier, Björn Deiseroth, Constantin Eichenberg, Lukas Balles
ICLR 2025 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
TMLR 2024 On the Choice of Learning Rate for Local SGD Lukas Balles, Prabhu Teja S, Cedric Archambeau
NeurIPSW 2024 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW 2024 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW 2024 U-μP: The Unit-Scaled Maximal Update Parametrization Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
NeurIPSW 2023 A Negative Result on Gradient Matching for Selective Backprop Lukas Balles, Cedric Archambeau, Giovanni Zappella
NeurIPSW 2023 Continual Learning with Low Rank Adaptation Martin Wistuba, Prabhu Teja S, Lukas Balles, Giovanni Zappella
ICLR 2023 PASHA: Efficient HPO and NAS with Progressive Resource Allocation Ondrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cedric Archambeau, Giovanni Zappella
NeurIPSW 2021 Gradient-Matching Coresets for Continual Learning Lukas Balles, Giovanni Zappella, Cedric Archambeau
NeurIPSW 2020 Self-Tuning Stochastic Optimization with Curvature-Aware Gradient Filtering Ricky T. Q. Chen, Dami Choi, Lukas Balles, David Duvenaud, Philipp Hennig
ICLR 2020 The Geometry of Sign Gradient Descent Lukas Balles, Fabian Pedregosa, Nicolas Le Roux
ICLR 2019 DeepOBS: A Deep Learning Optimizer Benchmark Suite Frank Schneider, Lukas Balles, Philipp Hennig
NeurIPS 2019 Limitations of the Empirical Fisher Approximation for Natural Gradient Descent Frederik Kunstner, Philipp Hennig, Lukas Balles
ICML 2018 Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients Lukas Balles, Philipp Hennig
UAI 2017 Coupling Adaptive Batch Sizes with Learning Rates Lukas Balles, Javier Romero, Philipp Hennig