Gromov, Andrey

16 publications

ICML 2025 | PARQ: Piecewise-Affine Regularized Quantization | Lisa Jin, Jianhao Ma, Zechun Liu, Andrey Gromov, Aaron Defazio, Lin Xiao
ICLRW 2025 | Spectral Journey: How Transformers Predict the Shortest Path | Andrew Cohen, Andrey Gromov, Kaiyu Yang, Yuandong Tian
ICLR 2025 | The Unreasonable Ineffectiveness of the Deeper Layers | Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso, Dan Roberts
NeurIPSW 2024 | Exploring Model Depth and Data Complexity Through the Lens of Cellular Automata | Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
ICMLW 2024 | Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data | Matthias Gerstgrasser, Rylan Schaeffer, Apratim Dey, Rafael Rafailov, Tomasz Korbak, Henry Sleight, Rajashree Agrawal, John Hughes, Dhruv Bhandarkar Pai, Andrey Gromov, Dan Roberts, Diyi Yang, David L. Donoho, Sanmi Koyejo
NeurIPS 2024 | Learning to Grok: Emergence of In-Context Learning and Skill Composition in Modular Arithmetic Tasks | Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
ICMLW 2024 | Learning to Grok: Emergence of In-Context Learning and Skill Composition in Modular Arithmetic Tasks | Tianyu He, Darshil Doshi, Aritra Das, Andrey Gromov
NeurIPSW 2024 | The Unreasonable Ineffectiveness of the Deeper Layers | Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso, Dan Roberts
ICLR 2024 | To Grok or Not to Grok: Disentangling Generalization and Memorization on Corrupted Algorithmic Datasets | Darshil Doshi, Aritra Das, Tianyu He, Andrey Gromov
ICLRW 2024 | Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations | Rylan Schaeffer, Berivan Isik, Dhruv Bhandarkar Pai, Andres Carranza, Victor Lecomte, Alyssa Unell, Mikail Khona, Thomas Edward Yerxa, Yann LeCun, SueYeon Chung, Andrey Gromov, Ravid Shwartz-Ziv, Sanmi Koyejo
NeurIPSW 2023 | An Information-Theoretic Understanding of Maximum Manifold Capacity Representations | Rylan Schaeffer, Berivan Isik, Victor Lecomte, Mikail Khona, Yann LeCun, Andrey Gromov, Ravid Shwartz-Ziv, Sanmi Koyejo
NeurIPSW 2023 | An Information-Theoretic Understanding of Maximum Manifold Capacity Representations | Victor Lecomte, Rylan Schaeffer, Berivan Isik, Mikail Khona, Yann LeCun, Sanmi Koyejo, Andrey Gromov, Ravid Shwartz-Ziv
NeurIPSW 2023 | An Information-Theoretic Understanding of Maximum Manifold Capacity Representations | Berivan Isik, Victor Lecomte, Rylan Schaeffer, Yann LeCun, Mikail Khona, Ravid Shwartz-Ziv, Sanmi Koyejo, Andrey Gromov
NeurIPSW 2023 | Associative Memory Under the Probabilistic Lens: Improved Transformers & Dynamic Memory Creation | Rylan Schaeffer, Mikail Khona, Nika Zahedi, Ila R Fiete, Andrey Gromov, Sanmi Koyejo
NeurIPS 2023 | Critical Initialization of Wide and Deep Neural Networks Using Partial Jacobians: General Theory and Applications | Darshil Doshi, Tianyu He, Andrey Gromov
NeurIPSW 2023 | Divergence at the Interpolation Threshold: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle | Rylan Schaeffer, Zachary Robertson, Akhilan Boopathy, Mikail Khona, Ila Fiete, Andrey Gromov, Sanmi Koyejo