On the Statistical Complexity of Offline Decision-Making
Abstract
We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes. The performance limits are captured by the pseudo-dimension of the (value) function class and a new characterization of the behavior policy that strictly subsumes all the previous notions of data coverage in the offline decision-making literature. In addition, we seek to understand the benefits of using offline data in online decision-making and show nearly minimax-optimal rates in a wide range of regimes.
Cite
Text
Nguyen-Tang and Arora. "On the Statistical Complexity of Offline Decision-Making." International Conference on Machine Learning, 2024.Markdown
[Nguyen-Tang and Arora. "On the Statistical Complexity of Offline Decision-Making." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/nguyentang2024icml-statistical/)BibTeX
@inproceedings{nguyentang2024icml-statistical,
title = {{On the Statistical Complexity of Offline Decision-Making}},
author = {Nguyen-Tang, Thanh and Arora, Raman},
booktitle = {International Conference on Machine Learning},
year = {2024},
pages = {37900-37928},
volume = {235},
url = {https://mlanthology.org/icml/2024/nguyentang2024icml-statistical/}
}