Dettmers, Tim

20 publications

ICLR 2025 Holistically Evaluating the Environmental Impact of Creating Language Models Jacob Morrison, Clara Na, Jared Fernandez, Tim Dettmers, Emma Strubell, Jesse Dodge
ICLR 2025 OLMoE: Open Mixture-of-Experts Language Models Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Evan Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi
NeurIPS 2024 MatFormer: Nested Transformer for Elastic Inference Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham Kakade, Ali Farhadi, Prateek Jain
NeurIPS 2024 Scaling Retrieval-Based Language Models with a Trillion-Token Datastore Rulin Shao, Jacqueline He, Akari Asai, Weijia Shi, Tim Dettmers, Sewon Min, Luke Zettlemoyer, Pang Wei Koh
ICMLW 2024 Seeded LoRA: Collaborative Fine-Tuning Through Seed Initialization of Adapters Alejandro R. Salamanca, Ahmet Üstün, Nicki Skafte Detlefsen, Tim Dettmers
ICLR 2024 SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression Tim Dettmers, Ruslan A. Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh
NeurIPS 2023 Distributed Inference and Fine-Tuning of Large Language Models over the Internet Alexander Borzunov, Max Ryabinin, Artem Chumachenko, Dmitry Baranchuk, Tim Dettmers, Younes Belkada, Pavel Samygin, Colin A. Raffel
NeurIPSW 2023 MatFormer: Nested Transformer for Elastic Inference Fnu Devvrit, Sneha Kudugunta, Aditya Kusupati, Tim Dettmers, Kaifeng Chen, Inderjit S. Dhillon, Yulia Tsvetkov, Hannaneh Hajishirzi, Sham M. Kakade, Ali Farhadi, Prateek Jain
NeurIPS 2023 QLoRA: Efficient Finetuning of Quantized LLMs Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
ICML 2023 SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov
NeurIPS 2023 Stable and Low-Precision Training for Large-Scale Vision-Language Models Mitchell Wortsman, Tim Dettmers, Luke Zettlemoyer, Ari Morcos, Ali Farhadi, Ludwig Schmidt
ICML 2023 The Case for 4-Bit Precision: K-Bit Inference Scaling Laws Tim Dettmers, Luke Zettlemoyer
ICLR 2022 8-Bit Optimizers via Block-Wise Quantization Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer
NeurIPSW 2022 Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models Margaret Li, Suchin Gururangan, Tim Dettmers, Mike Lewis, Tim Althoff, Noah A. Smith, Luke Zettlemoyer
NeurIPS 2022 GPT3.int8(): 8-Bit Matrix Multiplication for Transformers at Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
NeurIPSW 2022 Petals: Collaborative Inference and Fine-Tuning of Large Models Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, Colin Raffel
ICML 2021 BASE Layers: Simplifying Training of Large, Sparse Models Mike Lewis, Shruti Bhosale, Tim Dettmers, Naman Goyal, Luke Zettlemoyer
ICLR 2020 Sparse Networks from Scratch: Faster Training Without Losing Performance Tim Dettmers, Luke Zettlemoyer
AAAI 2018 Convolutional 2D Knowledge Graph Embeddings Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel
ICLR 2016 8-Bit Approximations for Parallelism in Deep Learning Tim Dettmers