Nagel, Markus

24 publications

TMLR 2025 Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference Andrii Skliar, Ties van Rozendaal, Romain Lepert, Todor Boinovski, Mart van Baalen, Markus Nagel, Paul N. Whatmough, Babak Ehteshami Bejnordi
ICMLW 2024 GPTVQ: The Blessing of Dimensionality for LLM Quantization Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Artem Bolshakov, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough
ICLRW 2024 How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training Jaeseong You, Minseop Park, Markus Nagel, Kyunggeun Lee, Seokjun An, Chirag S Patel
ICMLW 2024 Low Rank Quantization-Aware Training for LLMs Yelysei Bondarenko, Riccardo Del Chiaro, Markus Nagel
WACV 2024 MobileNVC: Real-Time 1080p Neural Video Compression on a Mobile Device Ties van Rozendaal, Tushar Singhal, Hoang Le, Guillaume Sautiere, Amir Said, Krishna Buska, Anjuman Raha, Dimitris Kalatzis, Hitarth Mehta, Frank Mayer, Liang Zhang, Markus Nagel, Auke Wiggers
NeurIPSW 2024 Optimizing Attention Hanno Ackermann, Hong Cai, Markus Nagel, Leyla Mirvakhabova, Farhad G. Zanjani, Fatih Porikli
ICMLW 2024 Rapid Switching and Multi-Adapter Fusion via Sparse High Rank Adapters Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Rafael Esteves, Shreya Kadambi, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart van Baalen, Harris Teague, Markus Nagel
NeurIPS 2024 Sparse High Rank Adapters Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Shreya Kadambi, Rafael Esteves, Shubhankar Borse, Paul Whatmough, Risheek Garrepalli, Mart van Baalen, Harris Teague, Markus Nagel
ICLR 2024 The LLM Surgeon Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Tijmen Blankevoort
NeurIPS 2023 Pruning vs Quantization: Which Is Better? Andrey Kuzmin, Markus Nagel, Mart van Baalen, Arash Behboodi, Tijmen Blankevoort
ICCVW 2023 QBitOpt: Fast and Accurate Bitwidth Reallocation During Training Jorn Peters, Marios Fournarakis, Markus Nagel, Mart van Baalen, Tijmen Blankevoort
NeurIPS 2023 Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
TMLR 2023 Quantization Robust Federated Learning for Efficient Inference on Heterogeneous Devices Kartik Gupta, Marios Fournarakis, Matthias Reisser, Christos Louizos, Markus Nagel
ICCV 2023 ResQ: Residual Quantization for Video Perception Davide Abati, Haitam Ben Yahia, Markus Nagel, Amirhossein Habibian
ICCVW 2023 SoftMax Bias Correction for Quantized Generative Models Nilesh Prasad Pandey, Marios Fournarakis, Chirag Patel, Markus Nagel
CVPRW 2022 Cyclical Pruning for Sparse Neural Networks Suraj Srinivas, Andrey Kuzmin, Markus Nagel, Mart van Baalen, Andrii Skliar, Tijmen Blankevoort
NeurIPS 2022 FP8 Quantization: The Power of the Exponent Andrey Kuzmin, Mart van Baalen, Yuwei Ren, Markus Nagel, Jorn Peters, Tijmen Blankevoort
ICLRW 2022 Implicit Neural Video Compression Yunfan Zhang, Ties van Rozendaal, Johann Brehmer, Markus Nagel, Taco Cohen
ICML 2022 Overcoming Oscillations in Quantization-Aware Training Markus Nagel, Marios Fournarakis, Yelysei Bondarenko, Tijmen Blankevoort
CVPRW 2022 Simulated Quantization, Real Power Savings Mart van Baalen, Brian Kahne, Eric Mahurin, Andrey Kuzmin, Andrii Skliar, Markus Nagel, Tijmen Blankevoort
CVPRW 2021 In-Hindsight Quantization Range Estimation for Quantized Training Marios Fournarakis, Markus Nagel
NeurIPS 2020 Bayesian Bits: Unifying Quantization and Pruning Mart van Baalen, Christos Louizos, Markus Nagel, Rana Ali Amjad, Ying Wang, Tijmen Blankevoort, Max Welling
CVPRW 2020 LSQ+: Improving Low-Bit Quantization Through Learnable Offsets and Better Initialization Yash Bhalgat, Jinwon Lee, Markus Nagel, Tijmen Blankevoort, Nojun Kwak
ICML 2020 Up or Down? Adaptive Rounding for Post-Training Quantization Markus Nagel, Rana Ali Amjad, Mart van Baalen, Christos Louizos, Tijmen Blankevoort