Sample-Efficient Cross-Entropy Method for Real-Time Planning

Abstract

Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.
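
To make the two additions named in the abstract more concrete, here is a minimal sketch of a CEM planner on a toy problem. It is not the authors' implementation. The abstract does not say how the correlation or the memory are realized, so this sketch assumes AR(1) noise as a stand-in for temporally-correlated action sampling and assumes that "memory" means carrying a fraction of the elite sequences into the next iteration. All names (`rollout_cost`, `correlated_noise`, `cem_plan`, `rho`, `keep_frac`) and the 1-D integrator task are hypothetical.

```python
import random


def rollout_cost(actions, x0=0.0, target=1.0):
    """Toy 1-D integrator x_{t+1} = x_t + a_t; cost is the summed squared distance to target."""
    x, cost = x0, 0.0
    for a in actions:
        x += a
        cost += (x - target) ** 2
    return cost


def correlated_noise(horizon, rho, rng):
    """AR(1) noise with unit stationary variance (assumed stand-in); rho=0 gives white noise."""
    eps = [rng.gauss(0.0, 1.0)]
    scale = (1.0 - rho ** 2) ** 0.5
    for _ in range(horizon - 1):
        eps.append(rho * eps[-1] + scale * rng.gauss(0.0, 1.0))
    return eps


def cem_plan(cost_fn, horizon=10, n_samples=32, n_elites=4, n_iters=5,
             rho=0.7, keep_frac=0.5, seed=0):
    """Return (best action sequence, its cost) found by a small CEM loop."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [0.5] * horizon
    memory = []  # elites carried over from the previous iteration (assumed form of "memory")
    best_actions, best_cost = None, float("inf")

    for _ in range(n_iters):
        # Sample action sequences around the current mean with correlated noise.
        samples = []
        for _ in range(n_samples):
            noise = correlated_noise(horizon, rho, rng)
            samples.append([m + s * e for m, s, e in zip(mean, std, noise)])
        samples.extend(memory)

        # Keep the lowest-cost sequences as elites.
        elites = sorted(samples, key=cost_fn)[:n_elites]
        memory = elites[:max(1, int(keep_frac * n_elites))]

        # Refit the per-timestep Gaussian to the elites.
        mean = [sum(e[t] for e in elites) / n_elites for t in range(horizon)]
        std = [
            max(1e-3, (sum((e[t] - mean[t]) ** 2 for e in elites) / n_elites) ** 0.5)
            for t in range(horizon)
        ]

        cost = cost_fn(elites[0])
        if cost < best_cost:
            best_actions, best_cost = elites[0], cost

    return best_actions, best_cost


if __name__ == "__main__":
    actions, cost = cem_plan(rollout_cost)
    print(f"best cost: {cost:.4f}")
```

Because the best elite is always carried over, the best cost in this sketch cannot get worse from one iteration to the next. Setting `rho=0.0` and `keep_frac=0.0` (which still keeps one elite here) gives something closer to plain CEM for comparison.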

Cite

Text

Pinneri et al. "Sample-Efficient Cross-Entropy Method for Real-Time Planning." Conference on Robot Learning, 2020.

Markdown

[Pinneri et al. "Sample-Efficient Cross-Entropy Method for Real-Time Planning." Conference on Robot Learning, 2020.](https://mlanthology.org/corl/2020/pinneri2020corl-sampleefficient/)

BibTeX

@inproceedings{pinneri2020corl-sampleefficient,
  title     = {{Sample-Efficient Cross-Entropy Method for Real-Time Planning}},
  author    = {Pinneri, Cristina and Sawant, Shambhuraj and Blaes, Sebastian and Achterhold, Jan and Stueckler, Joerg and Rolinek, Michal and Martius, Georg},
  booktitle = {Conference on Robot Learning},
  year      = {2020},
  pages     = {1049--1065},
  volume    = {155},
  url       = {https://mlanthology.org/corl/2020/pinneri2020corl-sampleefficient/}
}