Vyas, Nikhil

21 publications

ICLR 2025 | A New Perspective on Shampoo's Preconditioner | Depen Morwani, Itai Shapira, Nikhil Vyas, Eran Malach, Sham M. Kakade, Lucas Janson
ICLR 2025 | Deconstructing What Makes a Good Optimizer for Autoregressive Language Models | Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham M. Kakade
ICLR 2025 | How Does Critical Batch Size Scale in Pre-Training? | Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham M. Kakade
TMLR 2025 | Loss-to-Loss Prediction: Scaling Laws for All Datasets | David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham M. Kakade
ICLR 2025 | Mixture of Parrots: Experts Improve Memorization More than Reasoning | Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
ICLR 2025 | SOAP: Improving and Stabilizing Shampoo Using Adam for Language Modeling | Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham M. Kakade
ICMLW 2024 | AdaMeM: Memory Efficient Momentum for Adafactor | Nikhil Vyas, Depen Morwani, Sham M. Kakade
ICML 2024 | Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning | Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham M. Kakade, Boaz Barak
NeurIPSW 2024 | Connections Between Schedule-Free SGD, Accelerated SGD Variants, and Weight Averaging | Depen Morwani, Nikhil Vyas, Hanlin Zhang, Sham M. Kakade
NeurIPSW 2024 | Deconstructing What Makes a Good Optimizer for Language Models | Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham M. Kakade
ICML 2024 | Distinguishing the Knowable from the Unknowable with Language Models | Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman
NeurIPSW 2024 | How Does Critical Batch Size Scale in Pre-Training? | Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham M. Kakade
NeurIPSW 2024 | Mixture of Parrots: Mixtures of Experts Improve Memorization More than Reasoning | Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
NeurIPSW 2024 | SOAP: Improving and Stabilizing Shampoo Using Adam | Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham M. Kakade
NeurIPSW 2024 | Transformers Can Reinforcement Learn to Approximate Gittins Index | Vladimir Petrov, Nikhil Vyas, Lucas Janson
TMLR 2023 | Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning | Nikhil Vyas, Yamini Bansal, Preetum Nakkiran
NeurIPS 2023 | Feature-Learning Networks Are Consistent Across Widths at Realistic Scales | Nikhil Vyas, Alexander Atanasov, Blake Bordelon, Depen Morwani, Sabarish Sainathan, Cengiz Pehlevan
ICML 2023 | On Provable Copyright Protection for Generative Models | Nikhil Vyas, Sham M. Kakade, Boaz Barak
JAIR 2021 | On Super Strong ETH | Nikhil Vyas, R. Ryan Williams
AAAI 2020 | Results on a Super Strong Exponential Time Hypothesis | Nikhil Vyas, Ryan Williams
NeurIPS 2018 | Thwarting Adversarial Examples: An $l_0$-Robust Sparse Fourier Transform | Mitali Bafna, Jack Murtagh, Nikhil Vyas