Mishchenko, Konstantin

22 publications

TMLR 2025 · Partially Personalized Federated Learning: Breaking the Curse of Data Heterogeneity · Konstantin Mishchenko, Rustem Islamov, Eduard Gorbunov, Samuel Horváth
NeurIPS 2024 · Adaptive Proximal Gradient Method for Convex Optimization · Yura Malitsky, Konstantin Mishchenko
ICML 2024 · Prodigy: An Expeditiously Adaptive Parameter-Free Learner · Konstantin Mishchenko, Aaron Defazio
NeurIPS 2024 · The Road Less Scheduled · Aaron Defazio, Xingyu Yang, Harsh Mehta, Konstantin Mishchenko, Ahmed Khaled, Ashok Cutkosky
ICMLW 2023 · Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes · Konstantin Mishchenko, Slavomír Hanzely, Peter Richtárik
NeurIPS 2023 · DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method · Ahmed Khaled, Konstantin Mishchenko, Chi Jin
ICML 2023 · Learning-Rate-Free Learning by D-Adaptation · Aaron Defazio, Konstantin Mishchenko
NeurIPSW 2023 · Noise Injection Irons Out Local Minima and Saddle Points · Konstantin Mishchenko, Sebastian Stich
ICML 2023 · Two Losses Are Better than One: Faster Optimization Using a Cheaper Proxy · Blake Woodworth, Konstantin Mishchenko, Francis Bach
NeurIPS 2022 · Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays · Konstantin Mishchenko, Francis Bach, Mathieu Even, Blake Woodworth
ICLR 2022 · IntSGD: Adaptive Floatless Compression of Stochastic Gradients · Konstantin Mishchenko, Bokun Wang, Dmitry Kovalev, Peter Richtárik
NeurIPSW 2022 · Parameter Free Dual Averaging: Optimizing Lipschitz Functions in a Single Pass · Aaron Defazio, Konstantin Mishchenko
ICML 2022 · ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally! · Konstantin Mishchenko, Grigory Malinovsky, Sebastian Stich, Peter Richtárik
ICML 2022 · Proximal and Federated Random Reshuffling · Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik
UAI 2020 · 99% of Worker-Master Communication in Distributed Optimization Is Not Needed · Konstantin Mishchenko, Filip Hanzely, Peter Richtárik
ICML 2020 · Adaptive Gradient Descent Without Descent · Yura Malitsky, Konstantin Mishchenko
AISTATS 2020 · DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate · Saeed Soori, Konstantin Mishchenko, Aryan Mokhtari, Maryam Mehri Dehnavi, Mert Gurbuzbalaban
NeurIPS 2020 · Random Reshuffling: Simple Analysis with Vast Improvements · Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik
AISTATS 2020 · Revisiting Stochastic Extragradient · Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Yura Malitsky
AISTATS 2020 · Tighter Theory for Local SGD on Identical and Heterogeneous Data · Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik
ICML 2018 · A Delay-Tolerant Proximal-Gradient Algorithm for Distributed Learning · Konstantin Mishchenko, Franck Iutzeler, Jérôme Malick, Massih-Reza Amini
NeurIPS 2018 · SEGA: Variance Reduction via Gradient Sketching · Filip Hanzely, Konstantin Mishchenko, Peter Richtárik