ML Anthology
Luschi, Carlo
14 publications
ICLR
2025
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW
2024
Scalify: Scale Propagation for Efficient Low-Precision LLM Training
Paul Balanca, Samuel Hosegood, Carlo Luschi, Andrew W Fitzgibbon
ICML
2024
SparQ Attention: Bandwidth-Efficient LLM Inference
Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
ICLRW
2024
SparQ Attention: Bandwidth-Efficient LLM Inference
Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
ICMLW
2024
Towards Linking Graph Topology to Model Performance for Biomedical Knowledge Graph Completion
Alberto Cattaneo, Thomas Martynec, Stephen Bonner, Carlo Luschi, Daniel Justus
NeurIPSW
2024
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW
2024
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
ICMLW
2024
U-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr
NeurIPS
2023
Generating QM1B with PySCF$_{\text{IPU}}$
Alexander Mathiasen, Hatem Helal, Kerstin Klaser, Paul Balanca, Josef Dean, Carlo Luschi, Dominique Beaini, Andrew W. Fitzgibbon, Dominic Masters
ICMLW
2023
Repurposing Density Functional Theory to Suit Deep Learning
Alexander Mathiasen, Hatem Helal, Paul Balanca, Kerstin Klaeser, Josef Dean, Carlo Luschi, Dominique Beaini, Andrew W Fitzgibbon, Dominic Masters
NeurIPSW
2023
Training and Inference of Large Language Models Using 8-Bit Floating Point
Sergio P. Perez, Yan Zhang, James Briggs, Charlie Blake, Josh Levy-Kramer, Paul Balanca, Carlo Luschi, Stephen Barlow, Andrew W Fitzgibbon
ICML
2023
Unit Scaling: Out-of-the-Box Low-Precision Training
Charlie Blake, Douglas Orr, Carlo Luschi
NeurIPS
2021
Proxy-Normalizing Activations to Match Batch Normalization While Removing Batch Dependence
Antoine Labatie, Dominic Masters, Zach Eaton-Rosen, Carlo Luschi
NeurIPS
2020
Improving Neural Network Training in Low Dimensional Random Bases
Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi