Exponential Lower Bounds for Planning in MDPs with Linearly-Realizable Optimal Action-Value Functions
Abstract
We consider the problem of local planning in fixed-horizon and discounted Markov Decision Processes (MDPs) with linear function approximation and a generative model under the assumption that the optimal action-value function lies in the span of a feature map that is available to the planner. Previous work has left open the question of whether there exist sound planners that need only $\mbox{poly}(H,d)$ queries regardless of the MDP, where $H$ is the horizon and $d$ is the dimensionality of the features. We answer this question in the negative: we show that any sound planner must query at least $\min(e^{\Omega(d)},\Omega(2^H))$ samples in the fized-horizon setting and $e^{\Omega(d)}$ samples in the discounted setting. We also show that for any $\delta>0$, the least-squares value iteration algorithm with $\tilde{\mathcal{O}}(H^5 d^{H+1}/\delta^2)$ queries can compute a $\delta$-optimal policy in the fixed-horizon setting. We discuss implications and remaining open questions.
Cite
Text
Weisz et al. "Exponential Lower Bounds for Planning in MDPs with Linearly-Realizable Optimal Action-Value Functions." Proceedings of the 32nd International Conference on Algorithmic Learning Theory, 2021.Markdown
[Weisz et al. "Exponential Lower Bounds for Planning in MDPs with Linearly-Realizable Optimal Action-Value Functions." Proceedings of the 32nd International Conference on Algorithmic Learning Theory, 2021.](https://mlanthology.org/alt/2021/weisz2021alt-exponential/)BibTeX
@inproceedings{weisz2021alt-exponential,
title = {{Exponential Lower Bounds for Planning in MDPs with Linearly-Realizable Optimal Action-Value Functions}},
author = {Weisz, Gellért and Amortila, Philip and Szepesvári, Csaba},
booktitle = {Proceedings of the 32nd International Conference on Algorithmic Learning Theory},
year = {2021},
pages = {1237-1264},
volume = {132},
url = {https://mlanthology.org/alt/2021/weisz2021alt-exponential/}
}