ML Anthology
Vyas, Nikhil
21 publications
ICLR
2025
A New Perspective on Shampoo's Preconditioner
Depen Morwani, Itai Shapira, Nikhil Vyas, Eran Malach, Sham M. Kakade, Lucas Janson
ICLR
2025
Deconstructing What Makes a Good Optimizer for Autoregressive Language Models
Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham M. Kakade
ICLR
2025
How Does Critical Batch Size Scale in Pre-Training?
Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham M. Kakade
TMLR
2025
Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham M. Kakade
ICLR
2025
Mixture of Parrots: Experts Improve Memorization More than Reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
ICLR
2025
SOAP: Improving and Stabilizing Shampoo Using Adam for Language Modeling
Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham M. Kakade
ICMLW
2024
AdaMeM: Memory Efficient Momentum for Adafactor
Nikhil Vyas, Depen Morwani, Sham M. Kakade
ICML
2024
Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning
Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham M. Kakade, Boaz Barak
NeurIPSW
2024
Connections Between Schedule-Free SGD, Accelerated SGD Variants, and Weight Averaging
Depen Morwani, Nikhil Vyas, Hanlin Zhang, Sham M. Kakade
NeurIPSW
2024
Deconstructing What Makes a Good Optimizer for Language Models
Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham M. Kakade
ICML
2024
Distinguishing the Knowable from the Unknowable with Language Models
Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman
NeurIPSW
2024
How Does Critical Batch Size Scale in Pre-Training?
Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham M. Kakade
NeurIPSW
2024
Mixture of Parrots: Mixtures of Experts Improve Memorization More than Reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
NeurIPSW
2024
SOAP: Improving and Stabilizing Shampoo Using Adam
Nikhil Vyas, Depen Morwani, Rosie Zhao, Itai Shapira, David Brandfonbrener, Lucas Janson, Sham M. Kakade
NeurIPSW
2024
Transformers Can Reinforcement Learn to Approximate Gittins Index
Vladimir Petrov, Nikhil Vyas, Lucas Janson
TMLR
2023
Empirical Limitations of the NTK for Understanding Scaling Laws in Deep Learning
Nikhil Vyas, Yamini Bansal, Preetum Nakkiran
NeurIPS
2023
Feature-Learning Networks Are Consistent Across Widths at Realistic Scales
Nikhil Vyas, Alexander Atanasov, Blake Bordelon, Depen Morwani, Sabarish Sainathan, Cengiz Pehlevan
ICML
2023
On Provable Copyright Protection for Generative Models
Nikhil Vyas, Sham M. Kakade, Boaz Barak
JAIR
2021
On Super Strong ETH
Nikhil Vyas, R. Ryan Williams
AAAI
2020
Results on a Super Strong Exponential Time Hypothesis
Nikhil Vyas, Ryan Williams
NeurIPS
2018
Thwarting Adversarial Examples: An $l_0$-Robust Sparse Fourier Transform
Mitali Bafna, Jack Murtagh, Nikhil Vyas