Dey, Nolan Simran

6 publications

NeurIPS 2025 Don't Be Lazy: CompleteP Enables Compute-Efficient Deep Transformers Nolan Simran Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness

TMLR 2025 Neuron-Based Explanations of Neural Networks Sacrifice Completeness and Interpretability Nolan Simran Dey, Eric Taylor, Alexander Wong, Bryan P. Tripp, Graham W. Taylor

NeurIPS 2025 Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-Training Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness

ICLR 2025 Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs Shane Bergsma, Nolan Simran Dey, Gurpreet Gosal, Gavia Gray, Daria Soboleva, Joel Hestness

NeurIPSW 2024 Empirical Upper Bounds for Unstructured Sparsity in Compute-Efficient Language Modeling Esha Singh, Shane Bergsma, Nolan Simran Dey, Joel Hestness, Gavia Gray

NeurIPSW 2020 Identifying and Interpreting Tuning Dimensions in Deep Networks Nolan Simran Dey, Eric Taylor, Bryan P. Tripp, Alexander Wong, Graham W. Taylor