Value-Evolutionary-Based Reinforcement Learning

Abstract

Combining Evolutionary Algorithms (EAs) and Reinforcement Learning (RL) for policy search has been shown to improve RL performance. However, previous works largely overlook value-based RL in favor of merging EAs with policy-based RL. This paper introduces Value-Evolutionary-Based Reinforcement Learning (VEB-RL), which focuses on integrating EAs with value-based RL. The framework maintains a population of value functions instead of policies and uses the negative Temporal Difference error as the fitness metric for evolution. This metric is more sample-efficient for population evaluation than cumulative reward and is closely tied to the accuracy of the value function approximation. Additionally, VEB-RL lets elites of the population interact with the environment to provide high-quality samples for RL optimization, while the RL value function participates in the population's evolution in each generation. Experiments on MinAtar and Atari demonstrate the superiority of VEB-RL in significantly improving DQN, Rainbow, and SPR. Our code is available at https://github.com/yeshenpy/VEB-RL.
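
To make the loop described in the abstract concrete, below is a minimal, hypothetical sketch of the VEB-RL idea: a population of value functions ranked by negative TD error, elites that collect samples for RL updates, and the RL value function injected into the population each generation. The `LinearQ` class, `td_fitness`, `rollout`, and all hyperparameters are illustrative assumptions, not the authors' implementation; the official code is at the repository linked above.

```python
# Illustrative sketch only: toy linear Q-functions on CartPole, not the paper's DQN/Rainbow/SPR setup.
import copy
import random
import numpy as np
import gymnasium as gym

class LinearQ:
    """Toy linear Q-function so the sketch stays self-contained (assumption, not the paper's network)."""
    def __init__(self, obs_dim, n_actions):
        self.w = np.zeros((n_actions, obs_dim))
    def q_values(self, s):
        return self.w @ np.asarray(s)
    def mutate(self, sigma=0.05):
        child = copy.deepcopy(self)
        child.w += sigma * np.random.randn(*child.w.shape)
        return child

def td_fitness(q, batch, gamma=0.99):
    """Fitness = negative mean |TD error| on a shared batch (higher fitness = more accurate Q)."""
    errs = [abs(r + (0.0 if done else gamma * q.q_values(s2).max()) - q.q_values(s)[a])
            for s, a, r, s2, done in batch]
    return -float(np.mean(errs))

def rollout(q, env, buffer, eps=0.1):
    """An individual acts (eps-greedily) in the environment and fills the shared replay buffer."""
    s, _ = env.reset()
    done, ret = False, 0.0
    while not done:
        a = env.action_space.sample() if random.random() < eps else int(np.argmax(q.q_values(s)))
        s2, r, term, trunc, _ = env.step(a)
        done = term or trunc
        buffer.append((s, a, r, s2, done))
        s, ret = s2, ret + r
    return ret

env = gym.make("CartPole-v1")
obs_dim, n_actions = env.observation_space.shape[0], env.action_space.n
rl_q = LinearQ(obs_dim, n_actions)          # the value-based RL agent's Q-function
population = [LinearQ(obs_dim, n_actions).mutate(0.5) for _ in range(5)]
buffer = []

for generation in range(20):
    rollout(rl_q, env, buffer)              # keep the buffer populated
    batch = random.sample(buffer, min(64, len(buffer)))
    # 1) The RL value function joins the population's evolution each generation.
    candidates = population + [copy.deepcopy(rl_q)]
    # 2) Rank by negative TD error instead of cumulative reward (cheaper to evaluate).
    candidates.sort(key=lambda q: td_fitness(q, batch), reverse=True)
    elites = candidates[:2]
    # 3) Elites interact with the environment to provide samples for RL optimization.
    for q in elites:
        rollout(q, env, buffer)
    # 4) Next population: elites plus mutated offspring.
    population = elites + [random.choice(elites).mutate() for _ in range(4)]
    # 5) Standard TD update of the RL agent on buffered samples (Q-learning style).
    for s, a, r, s2, done in random.sample(buffer, min(128, len(buffer))):
        target = r + (0.0 if done else 0.99 * rl_q.q_values(s2).max())
        rl_q.w[a] -= 1e-3 * (rl_q.q_values(s)[a] - target) * np.asarray(s)
```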

Cite

Text

Li et al. "Value-Evolutionary-Based Reinforcement Learning." International Conference on Machine Learning, 2024.

Markdown

[Li et al. "Value-Evolutionary-Based Reinforcement Learning." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/li2024icml-valueevolutionarybased/)

BibTeX

@inproceedings{li2024icml-valueevolutionarybased,
  title     = {{Value-Evolutionary-Based Reinforcement Learning}},
  author    = {Li, Pengyi and Hao, Jianye and Tang, Hongyao and Zheng, Yan and Barez, Fazl},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {27875--27889},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/li2024icml-valueevolutionarybased/}
}