Mixtures of Experts Unlock Parameter Scaling for Deep RL
Abstract
The recent rapid progress in (self) supervised learning models is in large part predicted by empirical scaling laws: a model’s performance scales proportionally to its size. Analogous scaling laws remain elusive for reinforcement learning domains, however, where increasing the parameter count of a model often hurts its final performance. In this paper, we demonstrate that incorporating Mixture-of-Expert (MoE) modules, and in particular Soft MoEs (Puigcerver et al., 2023), into value-based networks results in more parameter-scalable models, evidenced by substantial performance increases across a variety of training regimes and model sizes. This work thus provides strong empirical evidence towards developing scaling laws for reinforcement learning.
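For readers unfamiliar with the Soft MoE mechanism the abstract refers to, the following is a minimal sketch of the dispatch/combine computation from Puigcerver et al. (2023), written in plain NumPy. It is an illustration under stated assumptions, not the authors' implementation; the names `soft_moe_layer`, `phi`, and `expert_weights` are hypothetical.

```python
import numpy as np

def soft_moe_layer(x, phi, expert_weights):
    """Minimal Soft MoE sketch (after Puigcerver et al., 2023).

    x:              (n, d)       input tokens
    phi:            (d, e * s)   learnable dispatch/combine parameters
    expert_weights: list of e callables, each mapping (s, d) -> (s, d)
    """
    e = len(expert_weights)
    s = phi.shape[1] // e

    logits = x @ phi                                  # (n, e*s)

    # Dispatch: each slot is a convex combination of the input tokens
    # (softmax over the token dimension).
    dispatch = np.exp(logits - logits.max(axis=0, keepdims=True))
    dispatch /= dispatch.sum(axis=0, keepdims=True)
    slots = dispatch.T @ x                            # (e*s, d)

    # Each expert processes its own block of s slots.
    outs = np.concatenate(
        [expert_weights[i](slots[i * s:(i + 1) * s]) for i in range(e)],
        axis=0,
    )                                                 # (e*s, d)

    # Combine: each token is a convex combination of the slot outputs
    # (softmax over the slot dimension).
    combine = np.exp(logits - logits.max(axis=1, keepdims=True))
    combine /= combine.sum(axis=1, keepdims=True)
    return combine @ outs                             # (n, d)

# Toy usage with random linear experts (illustrative values only).
rng = np.random.default_rng(0)
n, d, e, s = 16, 32, 4, 2
x = rng.normal(size=(n, d))
phi = rng.normal(size=(d, e * s)) * 0.02
experts = [(lambda W: (lambda z: z @ W))(rng.normal(size=(d, d)) * 0.02)
           for _ in range(e)]
y = soft_moe_layer(x, phi, experts)                   # shape (16, 32)
```

In the paper's setting, such a layer would replace the penultimate dense layer of a value-based network, so that the output tokens feed the Q-value head; the sketch above only shows the layer itself.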
Cite
Text
Obando Ceron et al. "Mixtures of Experts Unlock Parameter Scaling for Deep RL." International Conference on Machine Learning, 2024.
Markdown
[Obando Ceron et al. "Mixtures of Experts Unlock Parameter Scaling for Deep RL." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/obandoceron2024icml-mixtures/)
BibTeX
@inproceedings{obandoceron2024icml-mixtures,
title = {{Mixtures of Experts Unlock Parameter Scaling for Deep RL}},
author = {Obando Ceron, Johan Samir and Sokar, Ghada and Willi, Timon and Lyle, Clare and Farebrother, Jesse and Foerster, Jakob Nicolaus and Dziugaite, Gintare Karolina and Precup, Doina and Castro, Pablo Samuel},
booktitle = {International Conference on Machine Learning},
year = {2024},
pages = {38520--38540},
volume = {235},
url = {https://mlanthology.org/icml/2024/obandoceron2024icml-mixtures/}
}