Information-Theoretic Generalization Bounds for Batch Reinforcement Learning

Abstract

We analyze the generalization properties of batch reinforcement learning (batch RL) with value function approximation from an information-theoretic perspective. We derive generalization bounds for batch RL in terms of (conditional) mutual information. In addition, we show how certain structural assumptions on the value function space translate into bounds on the conditional mutual information. As a by-product, we derive a high-probability generalization bound via conditional mutual information, resolving a question left open by Steinke and Zakynthinou (2020); this result may be of independent interest.
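
For context, the conditional mutual information (CMI) framework invoked above bounds the expected generalization gap of a learning algorithm A. The following is a minimal sketch of the base framework as formulated by Steinke and Zakynthinou (2020) for losses bounded in [0, 1]; it shows the flavor of bound involved, not the paper's batch-RL statement:

% Supersample \tilde{S} \in \mathcal{Z}^{n \times 2}: n i.i.d. pairs of samples;
% selectors U = (U_1, \dots, U_n), i.i.d. uniform on \{1, 2\}, pick one sample
% per pair to form the training set S_U, leaving the other as a ghost sample.
\mathrm{CMI}_{\mathcal{D}}(A) = I\bigl(A(S_U);\, U \mid \tilde{S}\bigr)

% For a loss bounded in [0, 1], the expected generalization gap satisfies
\Bigl| \mathbb{E}\bigl[\mathcal{L}_{\mathcal{D}}(A(S_U)) - \mathcal{L}_{S_U}(A(S_U))\bigr] \Bigr|
  \le \sqrt{\frac{2\,\mathrm{CMI}_{\mathcal{D}}(A)}{n}}

The paper's contribution is to carry bounds of this type over to batch RL with value function approximation, and to upgrade the in-expectation statement to a high-probability one.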

Cite

Text

Liu. "Information-Theoretic Generalization Bounds for Batch Reinforcement Learning." NeurIPS 2024 Workshops: M3L, 2024.

Markdown

[Liu. "Information-Theoretic Generalization Bounds for Batch Reinforcement Learning." NeurIPS 2024 Workshops: M3L, 2024.](https://mlanthology.org/neuripsw/2024/liu2024neuripsw-informationtheoretic/)

BibTeX

@inproceedings{liu2024neuripsw-informationtheoretic,
  title     = {{Information-Theoretic Generalization Bounds for Batch Reinforcement Learning}},
  author    = {Liu, Xingtu},
  booktitle = {NeurIPS 2024 Workshops: M3L},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/liu2024neuripsw-informationtheoretic/}
}