Arora, Sanjeev

89 publications

ICML 2025 Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? Simon Park, Abhishek Panigrahi, Yun Cheng, Dingli Yu, Anirudh Goyal, Sanjeev Arora
ICLRW 2025 Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs? Simon Park, Abhishek Panigrahi, Yun Cheng, Dingli Yu, Anirudh Goyal, Sanjeev Arora
NeurIPS 2025 Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities Haoyu Zhao, Yihan Geng, Shange Tang, Yong Lin, Bohan Lyu, Hongzhou Lin, Chi Jin, Sanjeev Arora
ICLR 2025 Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora
NeurIPS 2025 LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Zihan Zheng, Zerui Cheng, Zeyu Shen, Shang Zhou, Kaiyuan Liu, Hansen He, Dongruixuan Li, Stanley Wei, Hangyi Hao, Jianzhu Yao, Peiyao Sheng, Zixuan Wang, Wenhao Chai, Aleksandra Korolova, Peter Henderson, Sanjeev Arora, Pramod Viswanath, Jingbo Shang, Saining Xie
TMLR 2025 Low Compute Unlearning via Sparse Representations Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Curtis Mozer, Sanjeev Arora, Yoshua Bengio, Anirudh Goyal
ICML 2025 On the Power of Context-Enhanced Learning in LLMs Xingyu Zhu, Abhishek Panigrahi, Sanjeev Arora
ICLRW 2025 On the Power of Context-Enhanced Learning in LLMs Xingyu Zhu, Abhishek Panigrahi, Sanjeev Arora
ICLR 2025 Provable Unlearning in Topic Modeling and Downstream Tasks Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal
ICLR 2025 Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
ICML 2025 Weak-to-Strong Generalization Even in Random Feature Networks, Provably Marko Medvedev, Kaifeng Lyu, Dingli Yu, Sanjeev Arora, Zhiyuan Li, Nathan Srebro
NeurIPS 2025 What Makes a Reward Model a Good Teacher? An Optimization Perspective Noam Razin, Zixuan Wang, Hubert Strauss, Stanley Wei, Jason D. Lee, Sanjeev Arora
ICLR 2024 A Quadratic Synchronization Rule for Distributed Deep Learning Xinran Gu, Kaifeng Lyu, Sanjeev Arora, Jingzhao Zhang, Longbo Huang
NeurIPSW 2024 AI-Assisted Generation of Difficult Math Questions Vedant Shah, Dingli Yu, Kaifeng Lyu, Simon Park, Jiatong Yu, Yinghui He, Nan Rosemary Ke, Michael Curtis Mozer, Yoshua Bengio, Sanjeev Arora, Anirudh Goyal
NeurIPS 2024 Can Models Learn Skill Composition from Examples? Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal, Sanjeev Arora
ICMLW 2024 Can Models Learn Skill Composition from Examples? Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal, Sanjeev Arora
NeurIPSW 2024 Can Models Learn Skill Composition from Examples? Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal, Sanjeev Arora
NeurIPS 2024 CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen
NeurIPS 2024 ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty Xindi Wu, Dingli Yu, Yangsibo Huang, Olga Russakovsky, Sanjeev Arora
NeurIPSW 2024 ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty Xindi Wu, Dingli Yu, Yangsibo Huang, Olga Russakovsky, Sanjeev Arora
NeurIPSW 2024 Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora
NeurIPSW 2024 Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora
NeurIPS 2024 Keeping LLMs Aligned After Fine-Tuning: The Crucial Role of Prompt Templates Kaifeng Lyu, Haoyu Zhao, Xinran Gu, Dingli Yu, Anirudh Goyal, Sanjeev Arora
ICLRW 2024 Keeping LLMs Aligned After Fine-Tuning: The Crucial Role of Prompt Templates Kaifeng Lyu, Haoyu Zhao, Xinran Gu, Dingli Yu, Anirudh Goyal, Sanjeev Arora
ICML 2024 LESS: Selecting Influential Data for Targeted Instruction Tuning Mengzhou Xia, Sadhika Malladi, Suchin Gururangan, Sanjeev Arora, Danqi Chen
ICLRW 2024 LESS: Selecting Influential Data for Targeted Instruction Tuning Mengzhou Xia, Sadhika Malladi, Suchin Gururangan, Sanjeev Arora, Danqi Chen
ICML 2024 Language Models as Science Tutors Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Aragon, Arturo Rodriguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, Zirui Wang, Xindi Wu, Mengzhou Xia, Wenhan Xia, Jiatong Yu, Junjie Zhu, Zhiyong Ren, Sanjeev Arora, Danqi Chen
NeurIPS 2024 Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora
ICMLW 2024 Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving Aniket Rajiv Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy P Lillicrap, Danilo Jimenez Rezende, Yoshua Bengio, Michael Curtis Mozer, Sanjeev Arora
NeurIPSW 2024 Provable Unlearning in Topic Modeling and Downstream Tasks Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal
ICLR 2024 SKILL-MIX: A Flexible and Expandable Family of Evaluations for AI Models Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora
ICML 2024 Trainable Transformer in Transformer Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora
NeurIPSW 2024 Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
NeurIPSW 2024 Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
ICML 2023 A Kernel-Based View of Language Model Fine-Tuning Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
ICLRW 2023 A Kernel-Based View of Language Model Fine-Tuning Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
NeurIPSW 2023 A Quadratic Synchronization Rule for Distributed Deep Learning Xinran Gu, Kaifeng Lyu, Sanjeev Arora, Jingzhao Zhang, Longbo Huang
NeurIPSW 2023 Do Transformers Parse While Predicting the Masked Word? Haoyu Zhao, Abhishek Panigrahi, Rong Ge, Sanjeev Arora
NeurIPS 2023 Fine-Tuning Language Models with Just Forward Passes Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason Lee, Danqi Chen, Sanjeev Arora
ICMLW 2023 Fine-Tuning Language Models with Just Forward Passes Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Jason D. Lee, Danqi Chen, Sanjeev Arora
ICMLW 2023 Fine-Tuning Language Models with Just Forward Passes Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora
NeurIPSW 2023 Skill-Mix: A Flexible and Expandable Family of Evaluations for AI Models Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora
ICML 2023 Task-Specific Skill Localization in Fine-Tuned Language Models Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora
NeurIPSW 2023 Trainable Transformer in Transformer Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora
ICLR 2023 Understanding Influence Functions and Datamodels via Harmonic Analysis Nikunj Saunshi, Arushi Gupta, Mark Braverman, Sanjeev Arora
ICLR 2023 Why (and When) Does Local SGD Generalize Better than SGD? Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora
NeurIPS 2022 Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent Zhiyuan Li, Tianhao Wang, Jason Lee, Sanjeev Arora
NeurIPS 2022 New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
ICLR 2022 On Predicting Generalization Using GANs Yi Zhang, Arushi Gupta, Nikunj Saunshi, Sanjeev Arora
NeurIPS 2022 On the SDEs and Scaling Rules for Adaptive Gradient Algorithms Sadhika Malladi, Kaifeng Lyu, Abhishek Panigrahi, Sanjeev Arora
ICML 2022 Understanding Contrastive Learning Requires Incorporating Inductive Biases Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy
ICML 2022 Understanding Gradient Descent on the Edge of Stability in Deep Learning Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi
NeurIPS 2022 Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora
ICLR 2022 What Happens After SGD Reaches Zero Loss? --A Mathematical Framework Zhiyuan Li, Tianhao Wang, Sanjeev Arora
NeurIPSW 2022 Why (and When) Does Local SGD Generalize Better than SGD? Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora
ICLR 2021 A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora
NeurIPS 2021 Evaluating Gradient Inversion Attacks and Defenses in Federated Learning Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora
NeurIPS 2021 Gradient Descent on Two-Layer Nets: Margin Maximization and Simplicity Bias Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora
NeurIPS 2021 On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs) Zhiyuan Li, Sadhika Malladi, Sanjeev Arora
ICLR 2021 Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets? Zhiyuan Li, Yi Zhang, Sanjeev Arora
ICML 2020 A Sample Complexity Separation Between Non-Convex and Convex Meta-Learning Nikunj Saunshi, Yi Zhang, Mikhail Khodak, Sanjeev Arora
ICLR 2020 An Exponential Learning Rate Schedule for Deep Learning Zhiyuan Li, Sanjeev Arora
ICLR 2020 Harnessing the Power of Infinitely Wide Deep Nets on Small-Data Tasks Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu
ICML 2020 InstaHide: Instance-Hiding Schemes for Private Distributed Learning Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora
NeurIPS 2020 Over-Parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality Yi Zhang, Orestis Plevrakis, Simon S Du, Xingguo Li, Zhao Song, Sanjeev Arora
ICML 2020 Provable Representation Learning for Imitation Learning via Bi-Level Optimization Sanjeev Arora, Simon Du, Sham Kakade, Yuping Luo, Nikunj Saunshi
NeurIPS 2020 Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora
ICLR 2019 A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu
ICML 2019 A Theoretical Analysis of Contrastive Unsupervised Representation Learning Nikunj Saunshi, Orestis Plevrakis, Sanjeev Arora, Mikhail Khodak, Hrishikesh Khandeparkar
NeurIPS 2019 Explaining Landscape Connectivity of Low-Cost Solutions for Multilayer Nets Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Rong Ge, Sanjeev Arora
ICML 2019 Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks Sanjeev Arora, Simon Du, Wei Hu, Zhiyuan Li, Ruosong Wang
NeurIPS 2019 Implicit Regularization in Deep Matrix Factorization Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
NeurIPS 2019 On Exact Computation with an Infinitely Wide Neural Net Sanjeev Arora, Simon S Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang
ICLR 2019 Theoretical Analysis of Auto Rate-Tuning by Batch Normalization Sanjeev Arora, Zhiyuan Li, Kaifeng Lyu
ICLR 2018 A Compressed Sensing View of Unsupervised Text Embeddings, Bag-of-N-Grams, and LSTMs Sanjeev Arora, Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli
COLT 2018 An Analysis of the t-SNE Algorithm for Data Visualization Sanjeev Arora, Wei Hu, Pravesh K. Kothari
ICLR 2018 Do GANs Learn the Distribution? Some Theory and Empirics Sanjeev Arora, Andrej Risteski, Yi Zhang
ICML 2018 On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization Sanjeev Arora, Nadav Cohen, Elad Hazan
ICML 2018 Stronger Generalization Bounds for Deep Nets via a Compression Approach Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang
ICLR 2017 A Simple but Tough-to-Beat Baseline for Sentence Embeddings Sanjeev Arora, Yingyu Liang, Tengyu Ma
ICML 2017 Generalization and Equilibrium in Generative Adversarial Nets (GANs) Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, Yi Zhang
COLT 2017 On the Ability of Neural Nets to Express Distributions Holden Lee, Rong Ge, Tengyu Ma, Andrej Risteski, Sanjeev Arora
ICML 2016 Provable Algorithms for Inference in Topic Models Sanjeev Arora, Rong Ge, Frederic Koehler, Tengyu Ma, Ankur Moitra
COLT 2015 Simple, Efficient, and Neural Algorithms for Sparse Coding Sanjeev Arora, Rong Ge, Tengyu Ma, Ankur Moitra
COLT 2014 New Algorithms for Learning Incoherent and Overcomplete Dictionaries Sanjeev Arora, Rong Ge, Ankur Moitra
ICML 2014 Provable Bounds for Learning Some Deep Representations Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma
ICML 2013 A Practical Algorithm for Topic Modeling with Provable Guarantees Sanjeev Arora, Rong Ge, Yonatan Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu
NeurIPS 2012 Provable ICA with Unknown Gaussian Noise, with Implications for Gaussian Mixtures and Autoencoders Sanjeev Arora, Rong Ge, Ankur Moitra, Sushant Sachdeva
NeurIPS 2002 A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics Eric Allender, Sanjeev Arora, Michael Kearns, Cristopher Moore, Alexander Russell