Luschi, Carlo

13 publications

ICLR 2025. U-$\mu$P: The Unit-Scaled Maximal Update Parametrization. Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr.
ICMLW 2024. Scalify: Scale Propagation for Efficient Low-Precision LLM Training. Paul Balanca, Samuel Hosegood, Carlo Luschi, Andrew W. Fitzgibbon.
ICML 2024. SparQ Attention: Bandwidth-Efficient LLM Inference. Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr.
ICLRW 2024. SparQ Attention: Bandwidth-Efficient LLM Inference. Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr.
ICMLW 2024. Towards Linking Graph Topology to Model Performance for Biomedical Knowledge Graph Completion. Alberto Cattaneo, Thomas Martynec, Stephen Bonner, Carlo Luschi, Daniel Justus.
NeurIPSW 2024. U-$\mu$P: The Unit-Scaled Maximal Update Parametrization. Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr.
ICMLW 2024. U-$\mu$P: The Unit-Scaled Maximal Update Parametrization. Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Yuri Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, Douglas Orr.
NeurIPS 2023. Generating QM1B with PySCF$_{\text{IPU}}$. Alexander Mathiasen, Hatem Helal, Kerstin Klaser, Paul Balanca, Josef Dean, Carlo Luschi, Dominique Beaini, Andrew W. Fitzgibbon, Dominic Masters.
ICMLW 2023. Repurposing Density Functional Theory to Suit Deep Learning. Alexander Mathiasen, Hatem Helal, Paul Balanca, Kerstin Klaeser, Josef Dean, Carlo Luschi, Dominique Beaini, Andrew W. Fitzgibbon, Dominic Masters.
NeurIPSW 2023. Training and Inference of Large Language Models Using 8-Bit Floating Point. Sergio P. Perez, Yan Zhang, James Briggs, Charlie Blake, Josh Levy-Kramer, Paul Balanca, Carlo Luschi, Stephen Barlow, Andrew W. Fitzgibbon.
ICML 2023. Unit Scaling: Out-of-the-Box Low-Precision Training. Charlie Blake, Douglas Orr, Carlo Luschi.
NeurIPS 2021. Proxy-Normalizing Activations to Match Batch Normalization While Removing Batch Dependence. Antoine Labatie, Dominic Masters, Zach Eaton-Rosen, Carlo Luschi.
NeurIPS 2020. Improving Neural Network Training in Low Dimensional Random Bases. Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi.