Doshi, Darshil

6 publications

ICML 2025 (How) Can Transformers Predict Pseudo-Random Numbers? Tao Tao, Darshil Doshi, Dayal Singh Kalra, Tianyu He, Maissam Barkeshli
NeurIPSW 2024 Exploring Model Depth and Data Complexity Through the Lens of Cellular Automata Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
NeurIPS 2024 Learning to Grok: Emergence of In-Context Learning and Skill Composition in Modular Arithmetic Tasks Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
ICMLW 2024 Learning to Grok: Emergence of In-Context Learning and Skill Composition in Modular Arithmetic Tasks Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
ICLR 2024 To Grok or Not to Grok: Disentangling Generalization and Memorization on Corrupted Algorithmic Datasets Darshil Doshi, Aritra Das, Tianyu He, Andrey Gromov
NeurIPS 2023 Critical Initialization of Wide and Deep Neural Networks Using Partial Jacobians: General Theory and Applications Darshil Doshi, Tianyu He, Andrey Gromov