Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon

Abstract

In reinforcement learning (RL), problems with long planning horizons are perceived as very challenging. Recent advances in PAC RL, however, show that the sample complexity of RL depends on the planning horizon only superficially. How can we explain such a difference? Noting that the technical assumptions in these upper bounds might have hidden away the challenges of long horizons, we ask the question: \emph{can we prove a lower bound with a horizon dependence when such assumptions are removed?} We also provide a few observations on the desired characteristics of a lower bound construction.

Cite

Text

Jiang and Agarwal. "Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon." Annual Conference on Computational Learning Theory, 2018.

Markdown

[Jiang and Agarwal. "Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon." Annual Conference on Computational Learning Theory, 2018.](https://mlanthology.org/colt/2018/jiang2018colt-open/)

BibTeX

@inproceedings{jiang2018colt-open,
  title     = {{Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon}},
  author    = {Jiang, Nan and Agarwal, Alekh},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2018},
  pages     = {3395--3398},
  url       = {https://mlanthology.org/colt/2018/jiang2018colt-open/}
}