Dey, Nolan Simran

6 publications

NeurIPS 2025. Don't Be Lazy: CompleteP Enables Compute-Efficient Deep Transformers. Nolan Simran Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness.
TMLR 2025. Neuron-Based Explanations of Neural Networks Sacrifice Completeness and Interpretability. Nolan Simran Dey, Eric Taylor, Alexander Wong, Bryan P. Tripp, Graham W. Taylor.
NeurIPS 2025. Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-Training. Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness.
ICLR 2025. Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs. Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness.
NeurIPSW 2024. Empirical Upper Bounds for Unstructured Sparsity in Compute-Efficient Language Modeling. Esha Singh, Shane Bergsma, Nolan Simran Dey, Joel Hestness, Gavia Gray.
NeurIPSW 2020. Identifying and Interpreting Tuning Dimensions in Deep Networks. Nolan Simran Dey, Eric Taylor, Bryan P. Tripp, Alexander Wong, Graham W. Taylor.