Belkin, Mikhail

65 publications

COLT 2025 A Gap Between the Gaussian RKHS and Neural Networks: An Infinite-Center Asymptotic Analysis Akash Kumar, Rahul Parhi, Mikhail Belkin

ICML 2025 Emergence in Non-Neural Models: Grokking Modular Arithmetic via Average Gradient Outer Product Neil Rohit Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

NeurIPS 2025 Fast Training of Large Kernel Models with Delayed Projections Amirhesam Abedsoltan, Siyuan Ma, Parthe Pandit, Mikhail Belkin

NeurIPS 2025 Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models Qingsong Wang, Zhengchao Wan, Mikhail Belkin, Yusu Wang

ICML 2025 Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks? Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen, Hongzhou Lin, Jingzhao Zhang, Mikhail Belkin

NeurIPS 2024 Average Gradient Outer Product as a Mechanism for Deep Neural Collapse Daniel Beaglehole, Peter Súkeník, Marco Mondelli, Mikhail Belkin

ICML 2024 Catapults in SGD: Spikes in the Training Loss and Their Impact on Generalization Through Feature Learning Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

NeurIPSW 2024 Emergence in Non-Neural Models: Grokking Modular Arithmetic via Average Gradient Outer Product Neil Rohit Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin

ICLR 2024 More Is Better: When Infinite Overparameterization Is Optimal and Overfitting Is Obligatory James B. Simon, Dhruva Karkada, Nikhil Ghosh, Mikhail Belkin

AISTATS 2024 On the Nyström Approximation for Preconditioning in Kernel Machines Amirhesam Abedsoltan, Parthe Pandit, Luis Rademacher, Mikhail Belkin

ICLR 2024 Quadratic Models for Understanding Catapult Dynamics of Neural Networks Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin

UAI 2024 Uncertainty Estimation with Recursive Feature Machines Daniel Gedon, Amirhesam Abedsoltan, Thomas B. Schön, Mikhail Belkin

ICML 2023 Cut Your Losses with Squentropy Like Hui, Mikhail Belkin, Stephen Wright

UAI 2023 Neural Tangent Kernel at Initialization: Linear Width Suffices Arindam Banerjee, Pedro Cisneros-Velarde, Libin Zhu, Mikhail Belkin

NeurIPSW 2023 On Feature Learning of Recursive Feature Machines and Automatic Relevance Determination Daniel Gedon, Amirhesam Abedsoltan, Thomas B. Schön, Mikhail Belkin

NeurIPSW 2023 SGD Batch Saturation for Training Wide Neural Networks Chaoyue Liu, Dmitriy Drusvyatskiy, Mikhail Belkin, Damek Davis, Yian Ma

ICML 2023 Toward Large Kernel Models Amirhesam Abedsoltan, Mikhail Belkin, Parthe Pandit

JMLR 2021 Classification vs Regression in Overparameterized Regimes: Does the Loss Function Matter? Vidya Muthukumar, Adhyyan Narang, Vignesh Subramanian, Mikhail Belkin, Daniel Hsu, Anant Sahai

ICLR 2021 Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks Like Hui, Mikhail Belkin

NeurIPS 2021 Multiple Descent: Design Your Own Generalization Curve Lin Chen, Yifei Min, Mikhail Belkin, Amin Karbasi

NeurIPS 2021 Risk Bounds for Over-Parameterized Maximum Margin Classification on Sub-Gaussian Mixtures Yuan Cao, Quanquan Gu, Mikhail Belkin

ICLR 2020 Accelerating SGD with Momentum for Over-Parameterized Learning Chaoyue Liu, Mikhail Belkin

AISTATS 2019 Does Data Interpolation Contradict Statistical Optimality? Mikhail Belkin, Alexander Rakhlin, Alexandre B. Tsybakov

ICMLW 2019 Memorization in Overparameterized Autoencoders Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

COLT 2018 Approximation Beats Concentration? An Approximation View on Inference with Smooth Radial Kernels Mikhail Belkin

NeurIPS 2018 Overfitting or Perfect Fitting? Risk Bounds for Classification and Regression Rules That Interpolate Mikhail Belkin, Daniel J. Hsu, Partha Mitra

ICML 2018 The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-Parametrized Learning Siyuan Ma, Raef Bassily, Mikhail Belkin

ICML 2018 To Understand Deep Learning We Need to Understand Kernel Learning Mikhail Belkin, Siyuan Ma, Soumik Mandal

ALT 2018 Unperturbed: Spectral Analysis Beyond Davis-Kahan Justin Eldridge, Mikhail Belkin, Yusu Wang

NeurIPS 2017 Diving into the Shallows: A Computational Perspective on Large-Scale Shallow Learning Siyuan Ma, Mikhail Belkin

AISTATS 2016 Back to the Future: Radial Basis Function Networks Revisited Qichao Que, Mikhail Belkin

COLT 2016 Basis Learning as an Algorithmic Primitive Mikhail Belkin, Luis Rademacher, James R. Voss

NeurIPS 2016 Clustering with Bregman Divergences: An Asymptotic Analysis Chaoyue Liu, Mikhail Belkin

NeurIPS 2016 Graphons, Mergeons, and so on! Justin Eldridge, Mikhail Belkin, Yusu Wang

ICML 2016 Learning Privately from Multiparty Data Jihun Hamm, Yingjun Cao, Mikhail Belkin

AAAI 2016 The Hidden Convexity of Spectral Clustering James R. Voss, Mikhail Belkin, Luis Rademacher

NeurIPS 2015 A Pseudo-Euclidean Iteration for Optimal Recovery in Noisy ICA James R. Voss, Mikhail Belkin, Luis Rademacher

COLT 2015 Beyond Hartigan Consistency: Merge Distortion Metric for Hierarchical Clustering Justin Eldridge, Mikhail Belkin, Yusu Wang

NeurIPS 2014 Learning with Fredholm Kernels Qichao Que, Mikhail Belkin, Yusu Wang

COLT 2014 The More, the Merrier: The Blessing of Dimensionality for Learning Large Gaussian Mixtures Joseph Anderson, Mikhail Belkin, Navin Goyal, Luis Rademacher, James R. Voss

COLT 2013 Blind Signal Separation in the Presence of Gaussian Noise Mikhail Belkin, Luis Rademacher, James R. Voss

NeurIPS 2013 Fast Algorithms for Gaussian Noise Invariant Independent Component Analysis James R. Voss, Luis Rademacher, Mikhail Belkin

NeurIPS 2013 Inverse Density as an Inverse Problem: The Fredholm Equation Approach Qichao Que, Mikhail Belkin

COLT 2012 Toward Understanding Complex Spaces: Graph Laplacians on Manifolds with Singularities and Boundaries Mikhail Belkin, Qichao Que, Yusu Wang, Xueyuan Zhou

NeurIPS 2011 Data Skeletonization via Reeb Graphs Xiaoyin Ge, Issam I. Safa, Mikhail Belkin, Yusu Wang

JMLR 2011 Laplacian Support Vector Machines Trained in the Primal Stefano Melacci, Mikhail Belkin

AISTATS 2011 Semi-Supervised Learning by Higher Order Regularization Xueyuan Zhou, Mikhail Belkin

JMLR 2010 On Learning with Integral Operators Lorenzo Rosasco, Mikhail Belkin, Ernesto De Vito

COLT 2010 Toward Learning Gaussian Mixtures with Arbitrary Separation Mikhail Belkin, Kaushik Sinha

COLT 2009 A Note on Learning with Integral Operators Lorenzo Rosasco, Mikhail Belkin, Ernesto De Vito

NeurIPS 2009 Semi-Supervised Learning Using Sparse Eigenfunction Bases Kaushik Sinha, Mikhail Belkin

ICML 2008 Data Spectroscopy: Learning Mixture Models Using Eigenspaces of Convolution Operators Tao Shi, Mikhail Belkin, Bin Yu

NeurIPS 2007 The Value of Labeled and Unlabeled Examples When the Model Is Imperfect Kaushik Sinha, Mikhail Belkin

NeurIPS 2006 Convergence of Laplacian Eigenmaps Mikhail Belkin, Partha Niyogi

JMLR 2006 Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples Mikhail Belkin, Partha Niyogi, Vikas Sindhwani

NeurIPS 2006 On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts Hariharan Narayanan, Mikhail Belkin, Partha Niyogi

ICML 2005 Beyond the Point Cloud: From Transductive to Semi-Supervised Learning Vikas Sindhwani, Partha Niyogi, Mikhail Belkin

COLT 2005 Towards a Theoretical Foundation for Laplacian-Based Manifold Methods Mikhail Belkin, Partha Niyogi

NeurIPS 2004 Limits of Spectral Clustering Ulrike von Luxburg, Olivier Bousquet, Mikhail Belkin

COLT 2004 On the Convergence of Spectral Clustering on Random Samples: The Normalized Case Ulrike von Luxburg, Olivier Bousquet, Mikhail Belkin

COLT 2004 Regularization and Semi-Supervised Learning on Large Graphs Mikhail Belkin, Irina Matveeva, Partha Niyogi

MLJ 2004 Semi-Supervised Learning on Riemannian Manifolds Mikhail Belkin, Partha Niyogi

NeCo 2003 Laplacian Eigenmaps for Dimensionality Reduction and Data Representation Mikhail Belkin, Partha Niyogi

NeurIPS 2002 Using Manifold Structure for Partially Labeled Classification Mikhail Belkin, Partha Niyogi

NeurIPS 2001 Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering Mikhail Belkin, Partha Niyogi