ML Anthology
Gray, Gavia
6 publications
NeurIPS
2025
Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-Training
Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness
ICLR
2025
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness
NeurIPSW
2024
Empirical Upper Bounds for Unstructured Sparsity in Compute-Efficient Language Modeling
Esha Singh, Shane Bergsma, Nolan Simran Dey, Joel Hestness, Gavia Gray
NeurIPS
2024
Normalization Layer Per-Example Gradients Are Sufficient to Predict Gradient Noise Scale in Transformers
Gavia Gray, Aman Tiwari, Shane Bergsma, Joel Hestness
NeurIPSW
2023
Efficient and Approximate Per-Example Gradient Norms for Gradient Noise Scale
Gavia Gray, Anshul Samar, Joel Hestness
NeurIPSW
2023
Transferring Movement Understanding for Parkinson’s Therapy by Generative Pre-Training
Emily Napier, Gavia Gray, Tristan Loria, Veronica Vuong, Michael Thaut, Sageev Oore