Don't Flatten, Tokenize! Unlocking the Key to SoftMoE's Efficacy in Deep RL
Abstract
The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases. While soft mixtures of experts (SoftMoEs) have recently shown promise in mitigating this issue for online RL, the reasons behind their effectiveness remain largely unknown. In this work we provide an in-depth analysis identifying the key factors driving this performance gain. We discover the surprising result that tokenizing the encoder output, rather than the use of multiple experts, is what drives the efficacy of SoftMoEs. Indeed, we demonstrate that even with an appropriately scaled single expert, we are able to maintain the performance gains, largely thanks to tokenization.
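To make the flatten-versus-tokenize distinction concrete, here is a minimal sketch (not taken from the paper) of the two ways a convolutional encoder's output can be handed to the layers that follow it; the 11×11×32 feature-map shape is purely illustrative:

```python
import numpy as np

# Illustrative convolutional encoder output for a single observation:
# a spatial feature map of shape (height, width, channels).
h, w, c = 11, 11, 32
features = np.random.randn(h, w, c)

# Conventional deep-RL pipeline: flatten the whole map into one long
# vector before the dense value/policy head.
flattened = features.reshape(-1)      # shape (h * w * c,) = (3872,)

# Tokenized alternative: keep each spatial position as its own token
# of dimension c, the form a SoftMoE-style layer consumes.
tokens = features.reshape(h * w, c)   # shape (h * w, c) = (121, 32)

print(flattened.shape, tokens.shape)
```

Per the abstract, it is this per-position tokenization, rather than the number of experts, that accounts for the performance gains, even when the tokens are routed to a single appropriately scaled expert.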
Cite
Text
Sokar et al. "Don't Flatten, Tokenize! Unlocking the Key to SoftMoE's Efficacy in Deep RL." International Conference on Learning Representations, 2025.
Markdown
[Sokar et al. "Don't Flatten, Tokenize! Unlocking the Key to SoftMoE's Efficacy in Deep RL." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/sokar2025iclr-don/)
BibTeX
@inproceedings{sokar2025iclr-don,
title = {{Don't Flatten, Tokenize! Unlocking the Key to SoftMoE's Efficacy in Deep RL}},
author = {Sokar, Ghada and Ceron, Johan Samir Obando and Courville, Aaron and Larochelle, Hugo and Castro, Pablo Samuel},
booktitle = {International Conference on Learning Representations},
year = {2025},
url = {https://mlanthology.org/iclr/2025/sokar2025iclr-don/}
}