Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks

Ryan Sander, Wilko Schwarting, Tim Seyde, Igor Gilitschenski, Sertac Karaman, Daniela Rus

NeurIPSW 2021

Abstract

Experience replay plays a crucial role in improving the sample efficiency of deep reinforcement learning agents. Recent advances in experience replay propose the use of Mixup [35] to further improve sample efficiency via synthetic sample generation. We build upon this idea with Neighborhood Mixup Experience Replay (NMER), a modular replay buffer that interpolates transitions with their closest neighbors in normalized state-action space. NMER preserves a locally linear approximation of the transition manifold by only performing Mixup between transitions with similar state-action features. Under NMER, a given transition’s set of state-action neighbors is dynamic and episode-agnostic, which in turn encourages greater policy generalizability via cross-episode interpolation. We combine our approach with recent off-policy deep reinforcement learning algorithms and evaluate on several continuous control environments. We observe that NMER improves sample efficiency by an average of 87% (TD3) and 29% (SAC) over baseline replay buffers, enabling agents to effectively recombine previous experiences and learn from limited data.
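
The sampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper name `nmer_sample`, the z-score normalization, the Euclidean distance, the brute-force neighbor search, the neighborhood size `k`, and the `Beta(alpha, alpha)` draw for the Mixup coefficient are all assumptions made for illustration.

```python
import math
import random


def nmer_sample(buffer, k=3, alpha=1.0, rng=random):
    """Hypothetical helper: return one locally interpolated transition.

    `buffer` is a list of (state, action, reward, next_state) tuples, where
    state, action, and next_state are lists of floats and reward is a float.
    """
    # Build state-action features and z-score each dimension (assumed normalization).
    feats = [list(s) + list(a) for s, a, _, _ in buffer]
    n, dim = len(feats), len(feats[0])
    means = [sum(f[d] for f in feats) / n for d in range(dim)]
    stds = [math.sqrt(sum((f[d] - means[d]) ** 2 for f in feats) / n) or 1.0
            for d in range(dim)]
    norm = [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in feats]

    # Sample an anchor transition uniformly, then rank all others by distance.
    i = rng.randrange(n)
    others = sorted((m for m in range(n) if m != i),
                    key=lambda m: math.dist(norm[i], norm[m]))
    j = rng.choice(others[:k])  # one of the k nearest neighbors, chosen uniformly

    # Mixup coefficient lambda ~ Beta(alpha, alpha), shared by all components.
    lam = rng.betavariate(alpha, alpha)

    def mix(x, y):
        # Convex combination of two scalars or two equal-length vectors.
        if isinstance(x, (int, float)):
            return lam * x + (1.0 - lam) * y
        return [lam * u + (1.0 - lam) * v for u, v in zip(x, y)]

    mixed = tuple(mix(x, y) for x, y in zip(buffer[i], buffer[j]))
    return mixed, i, j, lam
```

Because neighbors are recomputed from the buffer's current contents and are not restricted to the anchor's own episode, the interpolation partner can change as new transitions arrive. The sketch omits termination flags and recomputes normalization statistics on every call for brevity; a practical buffer would likely cache both the statistics and a nearest-neighbor index.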

Cite

Text

Sander et al. "Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks." NeurIPS 2021 Workshops: DeepRL, 2021.

Markdown

[Sander et al. "Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/sander2021neuripsw-neighborhood/)

BibTeX

@inproceedings{sander2021neuripsw-neighborhood,
  title     = {{Neighborhood Mixup Experience Replay: Local Convex Interpolation for Improved Sample Efficiency in Continuous Control Tasks}},
  author    = {Sander, Ryan and Schwarting, Wilko and Seyde, Tim and Gilitschenski, Igor and Karaman, Sertac and Rus, Daniela},
  booktitle = {NeurIPS 2021 Workshops: DeepRL},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/sander2021neuripsw-neighborhood/}
}