Provably Efficient Representation Selection in Low-Rank Markov Decision Processes: From Online to Offline RL

Zhang, W.; He, J.; Zhou, D.; Gu, Q.; Zhang, A.

UAI 2023 pp. 2488-2497

Abstract

The success of deep reinforcement learning (DRL) lies in its ability to learn a representation that is well-suited for the exploration and exploitation task. To understand how the choice of representation can improve the efficiency of reinforcement learning (RL), we study representation selection for a class of low-rank Markov Decision Processes (MDPs) where the transition kernel can be represented in a bilinear form. We propose an efficient algorithm, called ReLEX, for representation learning in both online and offline RL. Specifically, we show that the online version of ReLEX, called ReLEX-UCB, always performs no worse than the state-of-the-art algorithm without representation selection, and achieves a strictly better constant regret if the representation function class has a "coverage" property over the entire state-action space. For the offline counterpart, ReLEX-LCB, we show that the algorithm can find the optimal policy if the representation class can cover the state-action space and achieves gap-dependent sample complexity. This is the first result with constant sample complexity for representation learning in offline RL.

Cite

Text

Zhang et al. "Provably Efficient Representation Selection in Low-Rank Markov Decision Processes: From Online to Offline RL." Uncertainty in Artificial Intelligence, 2023.

Markdown

[Zhang et al. "Provably Efficient Representation Selection in Low-Rank Markov Decision Processes: From Online to Offline RL." Uncertainty in Artificial Intelligence, 2023.](https://mlanthology.org/uai/2023/zhang2023uai-provably/)

BibTeX

@inproceedings{zhang2023uai-provably,
  title     = {{Provably Efficient Representation Selection in Low-Rank Markov Decision Processes: From Online to Offline RL}},
  author    = {Zhang, W. and He, J. and Zhou, D. and Gu, Q. and Zhang, A.},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2023},
  pages     = {2488--2497},
  volume    = {216},
  url       = {https://mlanthology.org/uai/2023/zhang2023uai-provably/}
}