Generalized Value Functions for Large Action Sets

Abstract

Most value-function-approximation-based reinforcement learning algorithms available today focus on approximating the state (V) or state-action (Q) value function, treating efficient action selection as an afterthought. Real-world problems, however, tend to have large action spaces in which evaluating every possible action becomes impractical. This mismatch presents a major obstacle to successfully applying reinforcement learning to real-world problems. In this paper we present a unified view of V and Q functions and arrive at a new space-efficient representation in which action selection can be done exponentially faster, without the use of a model. We then describe how to compute this new value function efficiently via approximate linear programming, and we provide experimental results that demonstrate the effectiveness of the proposed approach.
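To give a rough intuition for how a value function over partial action specifications can make action selection exponentially faster, the sketch below encodes actions as bit strings and fixes one bit at a time, querying a value function defined on prefixes. This is only an illustrative toy, not the paper's algorithm or representation: the function names and the contrived prefix-value function are assumptions made for the example.

```python
# Hedged sketch: greedy bit-by-bit action selection over a value function
# defined on partial (prefix) action assignments. With actions encoded in
# num_bits bits, this issues O(num_bits) = O(log |A|) queries instead of
# enumerating all 2**num_bits actions. Names here are illustrative, not
# from the paper.

def select_action(prefix_value, num_bits):
    """Commit one action bit at a time, keeping the bit whose extended
    prefix the value function scores higher."""
    prefix = []
    for _ in range(num_bits):
        v0 = prefix_value(prefix + [0])
        v1 = prefix_value(prefix + [1])
        prefix.append(0 if v0 >= v1 else 1)
    return prefix

# Toy prefix-value function (an assumption for the demo): the best action
# is the bit string TARGET, and a prefix's value falls with its Hamming
# distance from the corresponding bits of TARGET.
TARGET = [1, 0, 1, 1]

def toy_prefix_value(prefix):
    return -sum(p != t for p, t in zip(prefix, TARGET))

action = select_action(toy_prefix_value, 4)  # recovers [1, 0, 1, 1]
```

The key point the example illustrates is that the value function must be meaningful on *partial* action specifications, so that each bit can be committed without enumerating the remaining combinations.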

Cite

Text

Pazis and Parr. "Generalized Value Functions for Large Action Sets." International Conference on Machine Learning, 2011.

Markdown

[Pazis and Parr. "Generalized Value Functions for Large Action Sets." International Conference on Machine Learning, 2011.](https://mlanthology.org/icml/2011/pazis2011icml-generalized/)

BibTeX

@inproceedings{pazis2011icml-generalized,
  title     = {{Generalized Value Functions for Large Action Sets}},
  author    = {Pazis, Jason and Parr, Ronald},
  booktitle = {International Conference on Machine Learning},
  year      = {2011},
  pages     = {1185--1192},
  url       = {https://mlanthology.org/icml/2011/pazis2011icml-generalized/}
}