The Limits of Transfer Reinforcement Learning with Latent Low-Rank Structure

Abstract

Many reinforcement learning (RL) algorithms are too costly to use in practice due to the large sizes $S, A$ of the problem's state and action spaces. To resolve this issue, we study transfer RL with latent low-rank structure. We consider the problem of transferring a latent low-rank representation when the source and target MDPs have transition kernels with Tucker rank $(S, d, A)$, $(S, S, d)$, $(d, S, A)$, or $(d, d, d)$. In each setting, we introduce the transferability coefficient $\alpha$ that measures the difficulty of representational transfer. Our algorithm learns latent representations in each source MDP and then exploits the linear structure to remove the dependence on $S$, $A$, or $SA$ in the target MDP regret bound. We complement our positive results with information-theoretic lower bounds showing that our algorithms (excluding the $(d, d, d)$ setting) are minimax-optimal with respect to $\alpha$.
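For concreteness, here is a minimal sketch of the structural assumption, using illustrative notation ($U$, $V_i$) that is not taken from the paper. With the transition tensor indexed as (next state, state, action), Tucker rank $(d, S, A)$ says the next-state mode has rank $d$:

$$P(s' \mid s, a) \;=\; \sum_{i=1}^{d} U(s', i)\, V_i(s, a), \qquad U \in \mathbb{R}^{S \times d},\quad V_i : \mathcal{S} \times \mathcal{A} \to \mathbb{R},$$

so every next-state distribution lies in a fixed $d$-dimensional subspace. The other settings impose the analogous rank-$d$ constraint on the current-state mode, the action mode, or all three modes simultaneously; learning the shared latent factors in the source MDPs is what allows the target regret bound to depend on $d$ rather than $S$, $A$, or $SA$.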

Cite

Text

Sam et al. "The Limits of Transfer Reinforcement Learning with Latent Low-Rank Structure." Neural Information Processing Systems, 2024. doi:10.52202/079017-3438

Markdown

[Sam et al. "The Limits of Transfer Reinforcement Learning with Latent Low-Rank Structure." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/sam2024neurips-limits/) doi:10.52202/079017-3438

BibTeX

@inproceedings{sam2024neurips-limits,
  title     = {{The Limits of Transfer Reinforcement Learning with Latent Low-Rank Structure}},
  author    = {Sam, Tyler and Chen, Yudong and Yu, Christina Lee},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3438},
  url       = {https://mlanthology.org/neurips/2024/sam2024neurips-limits/}
}