The Curse of Diversity in Ensemble-Based Exploration
Abstract
We uncover a surprising phenomenon in deep reinforcement learning: training a diverse ensemble of data-sharing agents -- a well-established exploration strategy -- can significantly impair the performance of the individual ensemble members when compared to standard single-agent training. Through careful analysis, we attribute the degradation in performance to two factors: the low proportion of self-generated data in the shared training data for each ensemble member, and the inability of individual ensemble members to learn efficiently from such highly off-policy data. We thus name this phenomenon *the curse of diversity*. We find that several intuitive solutions -- such as a larger replay buffer or a smaller ensemble size -- either fail to consistently mitigate the performance loss or undermine the advantages of ensembling. Finally, we demonstrate the potential of representation learning to counteract the curse of diversity with a novel method named Cross-Ensemble Representation Learning (CERL) in both discrete and continuous control domains. Our work offers valuable insights into an unexpected pitfall in ensemble-based exploration and raises important caveats for future applications of similar approaches.
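As a rough illustration of the "low proportion of self-generated data" mentioned above, the sketch below simulates a shared replay buffer filled by an ensemble. It rests on assumptions that are not stated in the abstract: the acting member is drawn uniformly at random for each episode, and all episodes have equal length. The function name `self_generated_fraction` and its parameters are hypothetical. Under these assumptions, each member's own share of the buffer comes out close to 1/N for an ensemble of N members.

```python
import random


def self_generated_fraction(num_members, num_episodes, seed=0):
    """Return, for each ensemble member, the share of buffer episodes it generated.

    Assumptions (illustrative only, not taken from the paper):
    - one member is chosen uniformly at random to act in each episode;
    - every episode contributes the same number of transitions.
    """
    rng = random.Random(seed)
    counts = [0] * num_members
    for _ in range(num_episodes):
        actor = rng.randrange(num_members)  # uniform choice of the acting member
        counts[actor] += 1
    return [count / num_episodes for count in counts]


# With 10 members, each one's self-generated share is roughly 1/10,
# so about 90% of what a member trains on was collected by other policies.
fractions = self_generated_fraction(num_members=10, num_episodes=100_000)
print([round(f, 3) for f in fractions])
```

A single agent with its own buffer would have a self-generated share of 1.0 under the same accounting, which is the contrast the abstract draws with standard single-agent training.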
Cite
Text
Lin et al. "The Curse of Diversity in Ensemble-Based Exploration." International Conference on Learning Representations, 2024.
Markdown
[Lin et al. "The Curse of Diversity in Ensemble-Based Exploration." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/lin2024iclr-curse/)
BibTeX
@inproceedings{lin2024iclr-curse,
title = {{The Curse of Diversity in Ensemble-Based Exploration}},
author = {Lin, Zhixuan and D'Oro, Pierluca and Nikishin, Evgenii and Courville, Aaron},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/lin2024iclr-curse/}
}