Reddi, Sashank J.

30 publications

ICML 2025 Bipartite Ranking from Multiple Labels: On Loss Versus Label Aggregation Michal Lukasik, Lin Chen, Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Felix X. Yu, Sashank J. Reddi, Gang Fu, Mohammadhossein Bateni, Sanjiv Kumar
ICLR 2025 Efficient Stagewise Pretraining via Progressive Subnetworks Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu, Sobhan Miryoosefi, Sashank J. Reddi, Satyen Kale, Sanjiv Kumar
ICLR 2025 Reasoning with Latent Thoughts: On the Power of Looped Transformers Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi
ICML 2025 Structured Preconditioners in Adaptive Optimization: A Unified Analysis Shuo Xie, Tianhao Wang, Sashank J. Reddi, Sanjiv Kumar, Zhiyuan Li
ICML 2024 Can Looped Transformers Learn to Implement Multi-Step Gradient Descent for In-Context Learning? Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar
ICMLW 2024 Efficient Document Ranking with Learnable Late Interactions Himanshu Jain, Ziwei Ji, Sashank J. Reddi, Ankit Singh Rawat, Felix Yu, Aditya Krishna Menon, Sadeep Jayasumana
ICMLW 2024 Efficient Document Ranking with Learnable Late Interactions Himanshu Jain, Ziwei Ji, Ankit Singh Rawat, Andreas Veit, Sadeep Jayasumana, Sashank J. Reddi, Aditya Krishna Menon, Felix Yu
NeurIPS 2024 On the Inductive Bias of Stacking Towards Improving Reasoning Nikunj Saunshi, Stefani Karp, Shankar Krishnan, Sobhan Miryoosefi, Sashank J. Reddi, Sanjiv Kumar
ICML 2024 Simplicity Bias via Global Convergence of Sharpness Minimization Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi, Stefanie Jegelka
ICLR 2023 Differentially Private Adaptive Optimization with Delayed Preconditioners Tian Li, Manzil Zaheer, Ken Liu, Sashank J. Reddi, Hugh Brendan McMahan, Virginia Smith
ICML 2023 Efficient Training of Language Models Using Few-Shot Learning Sashank J. Reddi, Sobhan Miryoosefi, Stefani Karp, Shankar Krishnan, Satyen Kale, Seungyeon Kim, Sanjiv Kumar
ICLR 2023 The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar
NeurIPSW 2022 Differentially Private Adaptive Optimization with Delayed Preconditioners Tian Li, Manzil Zaheer, Ken Liu, Sashank J. Reddi, Hugh Brendan McMahan, Virginia Smith
ICLR 2021 Adaptive Federated Optimization Sashank J. Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, Hugh Brendan McMahan
ICLR 2020 Are Transformers Universal Approximators of Sequence-to-Sequence Functions? Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
ICLR 2020 Can Gradient Clipping Mitigate Label Noise? Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
AISTATS 2019 Stochastic Negative Mining for Learning with Large Output Spaces Sashank J. Reddi, Satyen Kale, Felix Yu, Daniel Holtmann-Rice, Jiecao Chen, Sanjiv Kumar
AISTATS 2018 A Generic Approach for Escaping Saddle Points Sashank J. Reddi, Manzil Zaheer, Suvrit Sra, Barnabás Póczos, Francis R. Bach, Ruslan Salakhutdinov, Alexander J. Smola
ICLR 2018 On the Convergence of Adam and Beyond Sashank J. Reddi, Satyen Kale, Sanjiv Kumar
NeurIPS 2016 Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alexander J Smola
NeurIPS 2016 Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds Hongyi Zhang, Sashank J. Reddi, Suvrit Sra
ICML 2016 Stochastic Variance Reduction for Nonconvex Optimization Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alex Smola
NeurIPS 2016 Variance Reduction in Stochastic Gradient Langevin Dynamics Kumar Avinava Dubey, Sashank J. Reddi, Sinead A Williamson, Barnabas Poczos, Alexander J Smola, Eric P Xing
UAI 2015 Communication Efficient Coresets for Empirical Loss Minimization Sashank J. Reddi, Barnabás Póczos, Alexander J. Smola
UAI 2015 Large-Scale Randomized-Coordinate Descent Methods with Non-Separable Linear Constraints Sashank J. Reddi, Ahmed Hefny, Carlton Downey, Avinava Dubey, Suvrit Sra
NeurIPS 2015 On Variance Reduction in Stochastic Gradient Descent and Its Asynchronous Variants Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alexander J Smola
AISTATS 2015 On the High Dimensional Power of a Linear-Time Two Sample Test Under Mean-Shift Alternatives Sashank J. Reddi, Aaditya Ramdas, Barnabás Póczos, Aarti Singh, Larry A. Wasserman
UAI 2014 k-NN Regression on Functional Data with Incomplete Observations Sashank J. Reddi, Barnabás Póczos
ICML 2013 Scale Invariant Conditional Dependence Measures Sashank J. Reddi, Barnabas Poczos
NeurIPS 2010 MAP Estimation in Binary MRFs via Bipartite Multi-Cuts Sashank J. Reddi, Sunita Sarawagi, Sundar Vishwanathan