Exploiting Best-Match Equations for Efficient Reinforcement Learning
Abstract
This article presents and evaluates best-match learning, a new approach to reinforcement learning that trades off the sample efficiency of model-based methods with the space efficiency of model-free methods. Best-match learning works by approximating the solution to a set of best-match equations, which combine a sparse model with a model-free Q-value function constructed from samples not used by the model. We prove that, unlike regular sparse model-based methods, best-match learning is guaranteed to converge to the optimal Q-values in the tabular case. Empirical results demonstrate that best-match learning can substantially outperform regular sparse model-based methods, as well as several model-free methods that strive to improve the sample efficiency of temporal-difference methods. In addition, we demonstrate that best-match learning can be successfully combined with function approximation.
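Below is a minimal, illustrative sketch of the idea described in the abstract, not the paper's exact algorithm: in a tabular, "last-visit" setting, each state-action pair stores only its most recent transition (the sparse model), while a model-free Q-value absorbs all older samples, and queried values blend the two. All names, constants, and the single-sweep evaluation are assumptions made for illustration.

```python
# Illustrative sketch only: a sparse "last-visit" model combined with a
# model-free Q-value function built from the samples the model does not keep.
import numpy as np

n_states, n_actions = 5, 2
gamma, alpha = 0.95, 0.1

Q_mf = np.zeros((n_states, n_actions))   # model-free values built from older samples
last = {}                                # sparse model: (s, a) -> (reward, next state)
eta = np.zeros((n_states, n_actions))    # weight placed on the stored sample

def best_match_q(s):
    """Blend the model-free value with a one-step lookahead through the stored
    sample. (The paper solves a set of best-match equations as a fixed point;
    this sketch uses a single bootstrapping step through Q_mf for simplicity.)"""
    q = Q_mf[s].copy()
    for a in range(n_actions):
        if (s, a) in last:
            r, s_next = last[(s, a)]
            q[a] = (1 - eta[s, a]) * Q_mf[s, a] + eta[s, a] * (r + gamma * Q_mf[s_next].max())
    return q

def observe(s, a, r, s_next):
    """Fold the previously stored sample into Q_mf, then store the new one in the model."""
    if (s, a) in last:
        r_old, s_old = last[(s, a)]
        target = r_old + gamma * best_match_q(s_old).max()
        Q_mf[s, a] += alpha * (target - Q_mf[s, a])
    last[(s, a)] = (r, s_next)
    eta[s, a] = alpha
```

The intent is to mirror the abstract's description: value estimates lean on the freshest stored transition where one exists and fall back to the model-free estimate elsewhere, so the model's memory footprint stays sparse while recent samples are reused.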
Cite

Text

van Seijen et al. "Exploiting Best-Match Equations for Efficient Reinforcement Learning." Journal of Machine Learning Research, 2011.

Markdown

[van Seijen et al. "Exploiting Best-Match Equations for Efficient Reinforcement Learning." Journal of Machine Learning Research, 2011.](https://mlanthology.org/jmlr/2011/vanseijen2011jmlr-exploiting/)

BibTeX
@article{vanseijen2011jmlr-exploiting,
  title   = {{Exploiting Best-Match Equations for Efficient Reinforcement Learning}},
  author  = {van Seijen, Harm and Whiteson, Shimon and van Hasselt, Hado and Wiering, Marco},
  journal = {Journal of Machine Learning Research},
  year    = {2011},
  volume  = {12},
  pages   = {2045--2094},
  url     = {https://mlanthology.org/jmlr/2011/vanseijen2011jmlr-exploiting/}
}