ERLP: Ensembles of Reinforcement Learning Policies (Student Abstract)
Abstract
Reinforcement learning algorithms are sensitive to hyperparameters and require environment-specific tuning to improve performance. Ensembles of reinforcement learning models, on the other hand, are known to be much more robust and stable. However, training multiple models independently on an environment incurs high sample complexity. We present a methodology for creating multiple models from a single training instance by applying directed perturbation to the model parameters at regular intervals. The perturbation causes a single model to converge to several local minima over the course of optimization. By saving the model parameters at each such instance, we obtain multiple policies from one training run that are ensembled during evaluation. We evaluate our approach on challenging discrete and continuous control tasks and also discuss various ensembling strategies. Our framework is substantially sample-efficient, computationally inexpensive, and is seen to outperform state-of-the-art (SOTA) approaches.
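The abstract describes a loop of train, snapshot, and perturb. The sketch below illustrates one plausible reading of that loop; it is not the authors' implementation, and all names (`optimize_step`, `snapshot_interval`, `perturb_std`, the majority-vote ensembling) are illustrative assumptions.

```python
import copy
import numpy as np
import torch

def directed_perturbation(policy, std=0.01):
    """Hypothetical perturbation step: add scaled Gaussian noise to each
    parameter, nudging the policy out of its current local optimum."""
    with torch.no_grad():
        for p in policy.parameters():
            p.add_(std * torch.randn_like(p))

def train_with_snapshots(policy, optimize_step, total_steps=100_000,
                         snapshot_interval=10_000, perturb_std=0.01):
    """Single training run yielding multiple policies: after each interval,
    save the current parameters, then perturb and continue training."""
    snapshots = []
    for step in range(1, total_steps + 1):
        optimize_step(policy)  # one RL update (e.g., a policy-gradient step)
        if step % snapshot_interval == 0:
            snapshots.append(copy.deepcopy(policy))  # policy near a local minimum
            directed_perturbation(policy, perturb_std)
    return snapshots

def ensemble_act(snapshots, obs):
    """One possible ensembling strategy for discrete actions:
    majority vote over the snapshot policies' greedy actions."""
    x = torch.as_tensor(obs, dtype=torch.float32)
    votes = [int(p(x).argmax()) for p in snapshots]
    return int(np.bincount(votes).argmax())
```

Because every snapshot comes from the same run, the ensemble costs roughly one model's worth of environment samples, which is the sample-efficiency argument the abstract makes against training the ensemble members independently.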
Cite
Text
Saphal et al. "ERLP: Ensembles of Reinforcement Learning Policies (Student Abstract)." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I10.7225
Markdown
[Saphal et al. "ERLP: Ensembles of Reinforcement Learning Policies (Student Abstract)." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/saphal2020aaai-erlp/) doi:10.1609/AAAI.V34I10.7225
BibTeX
@inproceedings{saphal2020aaai-erlp,
title = {{ERLP: Ensembles of Reinforcement Learning Policies (Student Abstract)}},
author = {Saphal, Rohan and Ravindran, Balaraman and Mudigere, Dheevatsa and Avancha, Sasikanth and Kaul, Bharat},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2020},
pages = {13905--13906},
doi = {10.1609/AAAI.V34I10.7225},
url = {https://mlanthology.org/aaai/2020/saphal2020aaai-erlp/}
}