On the Statistical Complexity of Offline Decision-Making

Abstract

We study the statistical complexity of offline decision-making with function approximation, establishing (near) minimax-optimal rates for stochastic contextual bandits and Markov decision processes. The performance limits are captured by the pseudo-dimension of the (value) function class and a new characterization of the behavior policy that strictly subsumes all the previous notions of data coverage in the offline decision-making literature. In addition, we seek to understand the benefits of using offline data in online decision-making and show nearly minimax-optimal rates in a wide range of regimes.

Cite

Text

Nguyen-Tang and Arora. "On the Statistical Complexity of Offline Decision-Making." International Conference on Machine Learning, 2024.

Markdown

[Nguyen-Tang and Arora. "On the Statistical Complexity of Offline Decision-Making." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/nguyentang2024icml-statistical/)

BibTeX

@inproceedings{nguyentang2024icml-statistical,
  title     = {{On the Statistical Complexity of Offline Decision-Making}},
  author    = {Nguyen-Tang, Thanh and Arora, Raman},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {37900-37928},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/nguyentang2024icml-statistical/}
}