ML Anthology
Thérien, Benjamin
11 publications
TMLR
2025
Continual Pre-Training of MoEs: How Robust Is Your Router?
Benjamin Thérien, Charles-Étienne Joseph, Zain Sarwar, Ashwinee Panda, Anirban Das, Shi-Xiong Zhang, Stephen Rawls, Sambit Sahu, Eugene Belilovsky, Irina Rish
NeurIPS
2025
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts
Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Sambit Sahu, Tom Goldstein, Supriyo Chakraborty
TMLR
2025
Meta-Learning Optimizers for Communication-Efficient Learning
Charles-Étienne Joseph, Benjamin Thérien, Abhinav Moudgil, Boris Knyazev, Eugene Belilovsky
NeurIPSW
2024
$\mu$LO: Compute-Efficient Meta-Generalization of Learned Optimizers
Benjamin Thérien, Charles-Étienne Joseph, Boris Knyazev, Edouard Oyallon, Irina Rish, Eugene Belilovsky
NeurIPSW
2024
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Stephen Rawls, Sambit Sahu, Supriyo Chakraborty, Tom Goldstein
NeurIPSW
2024
Dense Backpropagation Improves Routing for Sparsely-Gated Mixture-of-Experts
Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Thérien, Stephen Rawls, Sambit Sahu, Supriyo Chakraborty, Tom Goldstein
WACV
2024
Object Re-Identification from Point Clouds
Benjamin Thérien, Chengjie Huang, Adrian Chow, Krzysztof Czarnecki
TMLR
2024
Simple and Scalable Strategies to Continually Pre-Train Large Language Models
Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats Leon Richter, Quentin Gregory Anthony, Eugene Belilovsky, Timothée Lesort, Irina Rish
ICMLW
2023
Continual Pre-Training of Large Language Models: How to Re-Warm Your Model?
Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats Leon Richter, Quentin Gregory Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort
NeurIPSW
2023
Learning Optimizers for Local SGD
Charles-Étienne Joseph, Benjamin Thérien, Abhinav Moudgil, Boris Knyazev, Eugene Belilovsky
CVPR
2022
Parametric Scattering Networks
Shanel Gauthier, Benjamin Thérien, Laurent Alsène-Racicot, Muawiz Chaudhary, Irina Rish, Eugene Belilovsky, Michael Eickenberg, Guy Wolf