On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)
Abstract
When an agent has limited information about its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.
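For concreteness, the decomposition described in the abstract can be written as a sum of two terms. The sketch below uses assumed notation that may differ from the paper's: φ denotes the state representation, π_{φ,D} the policy learned from a finite dataset D, π_{φ,∞} the policy obtained with unlimited data, and V^π the value of policy π.

% Hedged sketch of the bias/overfitting decomposition; the notation is assumed,
% not taken verbatim from the paper. The identity follows from adding and
% subtracting V^{\pi_{\phi,\infty}}.
\[
\underbrace{V^{\pi^{*}} - \mathbb{E}_{D}\!\left[ V^{\pi_{\phi,D}} \right]}_{\text{suboptimality}}
=
\underbrace{\left( V^{\pi^{*}} - V^{\pi_{\phi,\infty}} \right)}_{\text{asymptotic bias}}
+
\underbrace{\mathbb{E}_{D}\!\left[ V^{\pi_{\phi,\infty}} - V^{\pi_{\phi,D}} \right]}_{\text{overfitting}}
\]

Under this reading, a smaller representation φ shrinks the second (estimation) term but enlarges the first, which is the tradeoff the paper characterizes.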
Cite
Text
François-Lavet et al. "On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/706
Markdown
[François-Lavet et al. "On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/francoislavet2020ijcai-overfitting/) doi:10.24963/IJCAI.2020/706
BibTeX
@inproceedings{francoislavet2020ijcai-overfitting,
title = {{On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)}},
author = {François-Lavet, Vincent and Rabusseau, Guillaume and Pineau, Joelle and Ernst, Damien and Fonteneau, Raphael},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2020},
pages = {5055--5059},
doi = {10.24963/IJCAI.2020/706},
url = {https://mlanthology.org/ijcai/2020/francoislavet2020ijcai-overfitting/}
}