ML Anthology
Messmer, Bettina
8 publications
NeurIPS
2025
Enhancing Multilingual LLM Pretraining with Model-Based Data Selection
Bettina Messmer, Vinko Sabolčec, Martin Jaggi
ICLRW
2025
Enhancing Multilingual LLM Pretraining with Model-Based Data Selection
Bettina Messmer, Vinko Sabolčec, Martin Jaggi
ICML
2025
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan, Bettina Messmer, Nikita Doikov, Martin Jaggi
ICLRW
2025
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
Dongyang Fan, Bettina Messmer, Nikita Doikov, Martin Jaggi
ICMLW
2024
Analyzing & Eliminating Learning Rate Warmup in GPT Pre-Training
Atli Kosson, Bettina Messmer, Martin Jaggi
NeurIPS
2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
Atli Kosson, Bettina Messmer, Martin Jaggi
ICML
2024
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Atli Kosson, Bettina Messmer, Martin Jaggi
NeurIPSW
2023
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
Atli Kosson, Bettina Messmer, Martin Jaggi