Representation-Driven Reinforcement Learning

Abstract

We present a representation-driven framework for reinforcement learning. By representing policies as estimates of their expected values, we leverage techniques from contextual bandits to guide exploration and exploitation. In particular, embedding a policy network into a linear feature space allows us to reframe the exploration-exploitation problem as a representation-exploitation problem, where good policy representations enable optimal exploration. We demonstrate the effectiveness of this framework through its application to evolutionary and policy gradient-based approaches, leading to significantly improved performance compared to traditional methods. Our framework provides a new perspective on reinforcement learning, highlighting the importance of policy representation in determining optimal exploration-exploitation strategies.

Cite

Text

Nabati et al. "Representation-Driven Reinforcement Learning." International Conference on Machine Learning, 2023.

Markdown

[Nabati et al. "Representation-Driven Reinforcement Learning." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/nabati2023icml-representationdriven/)

BibTeX

@inproceedings{nabati2023icml-representationdriven,
  title     = {{Representation-Driven Reinforcement Learning}},
  author    = {Nabati, Ofir and Tennenholtz, Guy and Mannor, Shie},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {25588--25603},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/nabati2023icml-representationdriven/}
}