Momentum Boosted Episodic Memory for Improving Learning in Long-Tailed RL Environments
Abstract
Conventional Reinforcement Learning (RL) algorithms assume that the data distribution is uniform or nearly uniform. This is rarely the case in most real-world settings, such as autonomous driving, or in nature as animals roam: a few objects are encountered frequently while most remaining experiences occur rarely, and the resulting distribution is called Zipfian. Taking inspiration from the theory of complementary learning systems, an architecture for learning from Zipfian distributions is proposed in which long-tail states are discovered in an unsupervised manner and those states, together with their recurrent activations, are retained longer in episodic memory. The recurrent activations are then reinstated from episodic memory via a similarity search that assigns weighted importance to the retrieved entries. The proposed architecture yields improved performance on a Zipfian task over conventional architectures: it outperforms IMPALA by a margin of 20.3% when maps/objects occur with a uniform distribution and by 50.2% on the rarest 20% of the distribution.
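Since the abstract only sketches the mechanism, the following is a minimal illustrative sketch, not the authors' implementation, of how an episodic memory might retain rare states longer and reinstate recurrent activations through a similarity-weighted lookup. All names here (EpisodicMemory, write, reinstate, rarity, top_k) are hypothetical, and the rarity score, eviction rule, and IMPALA integration are assumptions for illustration only.

```python
# Toy sketch (assumed design, not the paper's code): store state embeddings as
# keys and recurrent activations as values; evict low-rarity (common) entries
# first so long-tail states persist; reinstate a similarity-weighted mixture.
import numpy as np


class EpisodicMemory:
    def __init__(self, capacity: int = 1024, dim: int = 64):
        self.capacity = capacity
        self.keys = np.zeros((0, dim))     # state embeddings used as lookup keys
        self.values = np.zeros((0, dim))   # stored recurrent activations
        self.priority = np.zeros(0)        # higher rarity -> kept longer in memory

    def write(self, state_emb: np.ndarray, recurrent_act: np.ndarray, rarity: float):
        """Insert an entry; when full, evict the lowest-priority (most common) one."""
        if len(self.keys) >= self.capacity:
            evict = int(np.argmin(self.priority))
            self.keys = np.delete(self.keys, evict, axis=0)
            self.values = np.delete(self.values, evict, axis=0)
            self.priority = np.delete(self.priority, evict)
        self.keys = np.vstack([self.keys, state_emb[None]])
        self.values = np.vstack([self.values, recurrent_act[None]])
        # A rarity score (e.g. from an unsupervised density estimate) serves as
        # the retention priority, so long-tail states stay in memory longer.
        self.priority = np.append(self.priority, rarity)

    def reinstate(self, query_emb: np.ndarray, top_k: int = 8) -> np.ndarray:
        """Return a similarity-weighted mixture of stored recurrent activations."""
        if len(self.keys) == 0:
            return np.zeros_like(query_emb)
        # Cosine similarity between the query embedding and all stored keys.
        sims = self.keys @ query_emb / (
            np.linalg.norm(self.keys, axis=1) * np.linalg.norm(query_emb) + 1e-8
        )
        idx = np.argsort(sims)[-top_k:]
        weights = np.exp(sims[idx])
        weights /= weights.sum()
        # Weighted importance: more similar memories contribute more.
        return weights @ self.values[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mem = EpisodicMemory(capacity=8, dim=4)
    for _ in range(16):
        mem.write(rng.normal(size=4), rng.normal(size=4), rarity=rng.random())
    print(mem.reinstate(rng.normal(size=4)))
```

In this sketch the reinstated vector would be fed back into the agent's recurrent state; how (and whether) that is gated or mixed with the current hidden state is a design choice the abstract does not specify.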
Cite

Text
Fernandes et al. "Momentum Boosted Episodic Memory for Improving Learning in Long-Tailed RL Environments." NeurIPS 2022 Workshops: DeepRL, 2022.

Markdown
[Fernandes et al. "Momentum Boosted Episodic Memory for Improving Learning in Long-Tailed RL Environments." NeurIPS 2022 Workshops: DeepRL, 2022.](https://mlanthology.org/neuripsw/2022/fernandes2022neuripsw-momentum/)

BibTeX
@inproceedings{fernandes2022neuripsw-momentum,
  title     = {{Momentum Boosted Episodic Memory for Improving Learning in Long-Tailed RL Environments}},
  author    = {Fernandes, Dolton Milagres and Kaushik, Pramod and Shukla, Harsh and Surampudi, Bapi Raju},
  booktitle = {NeurIPS 2022 Workshops: DeepRL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/fernandes2022neuripsw-momentum/}
}