Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces
Abstract
Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard with respect to both computational efficiency and approximation quality. We propose a new algorithm, LSTD(lambda)-RP, which combines random projection with eligibility traces to tackle these two challenges. We carry out a theoretical analysis of LSTD(lambda)-RP and provide meaningful upper bounds on the estimation error, the approximation error, and the total generalization error. These results demonstrate that LSTD(lambda)-RP benefits from both random projection and eligibility traces, and that it can achieve better performance than the prior LSTD-RP and LSTD(lambda) algorithms.
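As a rough illustration of the idea in the abstract, the following is a minimal sketch of LSTD with eligibility traces applied to randomly projected features. It is not the authors' implementation: the function name `lstd_lambda_rp`, the Gaussian projection with entries drawn from N(0, 1/d), and the small `ridge` term added for numerical stability are all assumptions made for this example.

```python
import numpy as np

def lstd_lambda_rp(features, rewards, gamma, lam, d, seed=0, ridge=1e-6):
    """Hypothetical sketch of LSTD(lambda) on randomly projected features.

    features: (n + 1, D) array of feature vectors along one trajectory.
    rewards:  (n,) array, rewards[t] observed on the step t -> t + 1.
    Returns the projection matrix H (d, D) and the weight vector theta (d,).
    """
    rng = np.random.default_rng(seed)
    n, D = len(rewards), features.shape[1]
    # Assumed projection: i.i.d. Gaussian entries with variance 1/d.
    H = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    proj = features @ H.T  # low-dimensional features, shape (n + 1, d)

    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace
    for t in range(n):
        z = gamma * lam * z + proj[t]
        A += np.outer(z, proj[t] - gamma * proj[t + 1])
        b += z * rewards[t]

    # The ridge term is an addition for this sketch; it keeps the solve
    # well-posed when A is singular or badly conditioned.
    theta = np.linalg.solve(A + ridge * np.eye(d), b)
    return H, theta
```

The estimated value of a state with feature vector `phi` is then `(H @ phi) @ theta`. Setting `lam=0` reduces the trace to the current projected feature, which corresponds to plain LSTD on the projected features.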
Cite
Text
Li et al. "Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/331
Markdown
[Li et al. "Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/li2018ijcai-finite/) doi:10.24963/IJCAI.2018/331
BibTeX
@inproceedings{li2018ijcai-finite,
title = {{Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces}},
author = {Li, Haifang and Xia, Yingce and Zhang, Wensheng},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2018},
pages = {2390-2396},
doi = {10.24963/IJCAI.2018/331},
url = {https://mlanthology.org/ijcai/2018/li2018ijcai-finite/}
}