Alistarh, Dan

75 publications

ICML 2025 Cache Me if You Must: Adaptive Key-Value Quantization for Large Language Models Alina Shutova, Vladimir Malinovskii, Vage Egiazarian, Denis Kuznedelev, Denis Mazur, Nikita Surkov, Ivan Ermakov, Dan Alistarh
ICLRW 2025 Cheap and Effective Personalization of Foundation Language Models for Imitating a User's Writing Style Armand Mihai Nicolicioiu, Eugenia Iofinova, Andrej Jovanovic, Eldar Kurtic, Mahdi Nikdan, Andrei Panferov, Ilia Markov, Nir N Shavit, Dan Alistarh
NeurIPS 2025 Efficient Data Selection at Scale via Influence Distillation Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh, Vahab Mirrokni
ICML 2025 EvoPress: Accurate Dynamic Model Compression via Evolutionary Search Oliver Sieberling, Denis Kuznedelev, Eldar Kurtic, Dan Alistarh
ICLRW 2025 EvoPress: Accurate Dynamic Model Compression via Evolutionary Search Oliver Sieberling, Denis Kuznedelev, Dan Alistarh
NeurIPS 2025 HALO: Hadamard-Assisted Lower-Precision Optimization for LLMs Saleh Ashkboos, Mahdi Nikdan, Soroush Tabesh, Roberto L. Castro, Torsten Hoefler, Dan Alistarh
NeurIPS 2025 Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Gleb Rodionov, Roman Garipov, Alina Shutova, George Yakushev, Erik Schultheis, Vage Egiazarian, Anton Sinitsin, Denis Kuznedelev, Dan Alistarh
AAAI 2025 Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence Shayan Talaei, Matin Ansaripour, Giorgi Nadiradze, Dan Alistarh
ICLR 2025 LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics Thomas Robert, Mher Safaryan, Ionut-Vlad Modoranu, Dan Alistarh
ICML 2025 Layer-Wise Quantization for Quantized Optimistic Dual Averaging Anh Duc Nguyen, Ilia Markov, Zhengqing Wu, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan Alistarh, Volkan Cevher
ICML 2025 QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Andrei Panferov, Jiale Chen, Soroush Tabesh, Mahdi Nikdan, Dan Alistarh
ICLRW 2025 QuEST: Training Accurate LLMs over Highly-Compressed Weights and Activations Andrei Panferov, Jiale Chen, Soroush Tabesh, Roberto L. Castro, Mahdi Nikdan, Dan Alistarh
NeurIPS 2025 Quartet: Native FP4 Training Can Be Optimal for Large Language Models Roberto L. Castro, Andrei Panferov, Soroush Tabesh, Oliver Sieberling, Jiale Chen, Mahdi Nikdan, Saleh Ashkboos, Dan Alistarh
ICLRW 2025 Recovery-on-the-Line: Linear Trends in Post-Quantization Performance Recovery Shashata Sawmya, Shuvom Sadhuka, Ragulan Sivakumar, Nir N Shavit, Dan Alistarh, Bonnie Berger
ICLR 2025 Scalable Mechanistic Neural Networks Jiale Chen, Dingling Yao, Adeel Pervez, Dan Alistarh, Francesco Locatello
TMLR 2025 TACO: Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression Denis Kuznedelev, Soroush Tabesh, Kimia Noorbakhsh, Elias Frantar, Sara Beery, Eldar Kurtic, Dan Alistarh
ICLR 2025 The Journey Matters: Average Parameter Count over Pre-Training Unifies Sparse and Dense Scaling Laws Tian Jin, Ahmed Imtiaz Humayun, Utku Evci, Suvinay Subramanian, Amir Yazdanbakhsh, Dan Alistarh, Gintare Karolina Dziugaite
NeurIPS 2025 Unified Scaling Laws for Compressed Representations Andrei Panferov, Alexandra Volkova, Ionut-Vlad Modoranu, Vage Egiazarian, Mher Safaryan, Dan Alistarh
ICLR 2025 Wasserstein Distances, Neuronal Entanglement, and Sparsity Shashata Sawmya, Linghao Kong, Ilia Markov, Dan Alistarh, Nir N Shavit
TMLR 2024 Accurate Neural Network Pruning Requires Rethinking Sparse Optimization Denis Kuznedelev, Eldar Kurtic, Eugenia Iofinova, Elias Frantar, Alexandra Peste, Dan Alistarh
AISTATS 2024 AsGrad: A Sharp Unified Analysis of Asynchronous-SGD Algorithms Rustem Islamov, Mher Safaryan, Dan Alistarh
AISTATS 2024 Communication-Efficient Federated Learning with Data and Client Heterogeneity Hossein Zakerinia, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh
ICML 2024 Error Feedback Can Accurately Compress Preconditioners Ionut-Vlad Modoranu, Aleksei Kalinov, Eldar Kurtic, Elias Frantar, Dan Alistarh
ICML 2024 Extreme Compression of Large Language Models via Additive Quantization Vage Egiazarian, Andrei Panferov, Denis Kuznedelev, Elias Frantar, Artem Babenko, Dan Alistarh
CPAL 2024 How to Prune Your Language Model: Recovering Accuracy on the “Sparsity May Cry” Benchmark Eldar Kurtic, Torsten Hoefler, Dan Alistarh
NeurIPSW 2024 Layer-Wise Quantization for Distributed Variational Inequalities Anh Duc Nguyen, Ilia Markov, Ali Ramezani-Kebrya, Kimon Antonakopoulos, Dan Alistarh, Volkan Cevher
NeurIPS 2024 MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence Ionut-Vlad Modoranu, Mher Safaryan, Grigory Malinovsky, Eldar Kurtic, Thomas Robert, Peter Richtárik, Dan Alistarh
NeurIPS 2024 PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression Vladimir Malinovskii, Denis Mazur, Ivan Ilin, Denis Kuznedelev, Konstantin Burlachenko, Kai Yi, Dan Alistarh, Peter Richtárik
NeurIPS 2024 QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Pashmina Cameron, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman
ICML 2024 RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation Mahdi Nikdan, Soroush Tabesh, Elvir Crnčević, Dan Alistarh
ICML 2024 SPADE: Sparsity-Guided Debugging for Deep Neural Networks Arshia Soltani Moakhar, Eugenia Iofinova, Elias Frantar, Dan Alistarh
ICLR 2024 Scaling Laws for Sparsely-Connected Foundation Models Elias Frantar, Carlos Riquelme Ruiz, Neil Houlsby, Dan Alistarh, Utku Evci
ICLR 2024 SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression Tim Dettmers, Ruslan A. Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh
NeurIPS 2024 The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information Diyuan Wu, Ionut-Vlad Modoranu, Mher Safaryan, Denis Kuznedelev, Dan Alistarh
CVPR 2023 Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures Eugenia Iofinova, Alexandra Peste, Dan Alistarh
NeurIPS 2023 CAP: Correlation-Aware Pruning for Highly-Accurate Sparse Vision Models Denis Kuznedelev, Eldar Kurtić, Elias Frantar, Dan Alistarh
ICLR 2023 CrAM: A Compression-Aware Minimizer Alexandra Peste, Adrian Vladu, Eldar Kurtic, Christoph H Lampert, Dan Alistarh
NeurIPSW 2023 Decentralized Learning Dynamics in the Gossip Model John Lazarsfeld, Dan Alistarh
ICMLW 2023 Generating Efficient Kernels for Quantized Inference on Large Language Models Tommaso Pegolotti, Elias Frantar, Dan Alistarh, Markus Püschel
NeurIPS 2023 Knowledge Distillation Performs Partial Variance Reduction Mher Safaryan, Alexandra Peste, Dan Alistarh
ICLR 2023 OPTQ: Accurate Quantization for Generative Pre-Trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
ICML 2023 Quantized Distributed Training of Large Models with Convergence Guarantees Ilia Markov, Adrian Vladu, Qi Guo, Dan Alistarh
ICML 2023 SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot Elias Frantar, Dan Alistarh
ICML 2023 SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge Mahdi Nikdan, Tommaso Pegolotti, Eugenia Iofinova, Eldar Kurtic, Dan Alistarh
NeurIPS 2023 ZipLM: Inference-Aware Structured Pruning of Language Models Eldar Kurtić, Elias Frantar, Dan Alistarh
ICMLW 2023 ZipLM: Inference-Aware Structured Pruning of Language Models Eldar Kurtic, Elias Frantar, Dan Alistarh
CVPR 2022 How Well Do Sparse ImageNet Models Transfer? Eugenia Iofinova, Alexandra Peste, Mark Kurtz, Dan Alistarh
NeurIPS 2022 Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning Elias Frantar, Dan Alistarh
ICML 2022 SPDY: Accurate Pruning with Speedup Guarantees Elias Frantar, Dan Alistarh
NeurIPS 2021 AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks Alexandra Peste, Eugenia Iofinova, Adrian Vladu, Dan Alistarh
NeurIPS 2021 Asynchronous Decentralized SGD with Quantized and Local Updates Giorgi Nadiradze, Amirmojtaba Sabour, Peter Davies, Shigang Li, Dan Alistarh
AAAI 2021 Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees Vyacheslav Kungurtsev, Malcolm Egan, Bapi Chatterjee, Dan Alistarh
ICLR 2021 Byzantine-Resilient Non-Convex Stochastic Gradient Descent Zeyuan Allen-Zhu, Faeze Ebrahimianghazani, Jerry Li, Dan Alistarh
ICML 2021 Communication-Efficient Distributed Optimization with Quantized Preconditioners Foivos Alimisis, Peter Davies, Dan Alistarh
NeurIPS 2021 Distributed Principal Component Analysis with Limited Communication Foivos Alimisis, Peter Davies, Bart Vandereycken, Dan Alistarh
AAAI 2021 Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh
NeurIPS 2021 M-FAC: Efficient Matrix-Free Approximations of Second-Order Information Elias Frantar, Eldar Kurtic, Dan Alistarh
JMLR 2021 NUQSGD: Provably Communication-Efficient Data-Parallel SGD via Nonuniform Quantization Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy
ICLR 2021 New Bounds for Distributed Mean Estimation and Variance Reduction Peter Davies, Vijaykrishna Gurunanthan, Niusha Moshrefi, Saleh Ashkboos, Dan Alistarh
NeurIPSW 2021 SSSE: Efficiently Erasing Samples from Trained Machine Learning Models Alexandra Peste, Dan Alistarh, Christoph H Lampert
JMLR 2021 Sparsity in Deep Learning: Pruning and Growth for Efficient Inference and Training in Neural Networks Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste
NeurIPS 2021 Towards Tight Communication Lower Bounds for Distributed Optimisation Janne H. Korhonen, Dan Alistarh
NeurIPS 2020 Adaptive Gradient Quantization for Data-Parallel SGD Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Daniel M. Roy, Ali Ramezani-Kebrya
ICML 2020 Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks Mark Kurtz, Justin Kopinsky, Rati Gelashvili, Alexander Matveev, John Carr, Michael Goin, William Leiserson, Sage Moore, Nir Shavit, Dan Alistarh
ICML 2020 On the Sample Complexity of Adversarial Multi-Source PAC Learning Nikola Konstantinov, Elias Frantar, Dan Alistarh, Christoph Lampert
NeurIPS 2020 Scalable Belief Propagation via Relaxed Scheduling Vitalii Aksenov, Dan Alistarh, Janne H. Korhonen
NeurIPS 2020 WoodFisher: Efficient Second-Order Approximation for Neural Network Compression Sidak Pal Singh, Dan Alistarh
ICML 2019 Distributed Learning over Unreliable Networks Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu
NeurIPS 2019 Powerset Convolutional Neural Networks Chris Wendler, Markus Püschel, Dan Alistarh
NeurIPS 2018 Byzantine Stochastic Gradient Descent Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li
ICLR 2018 Model Compression via Distillation and Quantization Antonio Polino, Razvan Pascanu, Dan Alistarh
NeurIPS 2018 The Convergence of Sparsified Gradient Methods Dan Alistarh, Torsten Hoefler, Mikael Johansson, Nikola Konstantinov, Sarit Khirirat, Cedric Renggli
NeurIPS 2017 QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, Milan Vojnovic
ICML 2017 ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang
NeurIPS 2015 Streaming Min-Max Hypergraph Partitioning Dan Alistarh, Jennifer Iglesias, Milan Vojnovic