Generalized Value Functions for Large Action Sets
Abstract
The majority of value function approximation based reinforcement learning algorithms available today focus on approximating the state (V) or state-action (Q) value function, with efficient action selection coming as an afterthought. Real-world problems, on the other hand, tend to have large action spaces, where evaluating every possible action becomes impractical. This mismatch presents a major obstacle to successfully applying reinforcement learning to real-world problems. In this paper we present a unified view of V and Q functions and arrive at a new space-efficient representation, in which action selection can be performed exponentially faster, without the use of a model. We then describe how to compute this new value function efficiently via approximate linear programming and provide experimental results that demonstrate the effectiveness of the proposed approach.
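To make the "exponentially faster action selection" claim concrete, the sketch below contrasts naive argmax over a combinatorial action set with structured selection. This is a hypothetical illustration, not the paper's representation: it assumes actions are length-k binary vectors (so |A| = 2^k) and that the action-value is linear in the action bits, in which case each bit can be maximized independently in O(k) instead of enumerating all 2^k actions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setting: actions are binary vectors of length k, so |A| = 2^k.
k = 12
# Illustrative assumption (not the paper's model): Q(s, a) = sum_i w_i * a_i,
# with the weights w depending on the current state s.
w = rng.normal(size=k)

# Naive action selection: enumerate all 2^k actions -- O(2^k) evaluations.
best_naive = max(itertools.product([0, 1], repeat=k),
                 key=lambda a: float(np.dot(w, a)))

# Structured selection: each bit contributes independently, so each bit is
# switched on exactly when its weight is positive -- O(k) work.
best_fast = tuple(int(wi > 0) for wi in w)

assert best_naive == best_fast
```

With k = 12 the naive loop already evaluates 4096 actions to reach the same answer the factored rule finds in 12 comparisons; the gap grows exponentially with k, which is the kind of mismatch the abstract refers to.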
Cite
Text
Pazis and Parr. "Generalized Value Functions for Large Action Sets." International Conference on Machine Learning, 2011.
Markdown
[Pazis and Parr. "Generalized Value Functions for Large Action Sets." International Conference on Machine Learning, 2011.](https://mlanthology.org/icml/2011/pazis2011icml-generalized/)
BibTeX
@inproceedings{pazis2011icml-generalized,
title = {{Generalized Value Functions for Large Action Sets}},
author = {Pazis, Jason and Parr, Ronald},
booktitle = {International Conference on Machine Learning},
year = {2011},
pages = {1185-1192},
url = {https://mlanthology.org/icml/2011/pazis2011icml-generalized/}
}