Multi-User Reinforcement Learning with Low Rank Rewards
Abstract
We consider collaborative multi-user reinforcement learning, where multiple users have the same state-action space and transition probabilities but different rewards. Under the assumption that the reward matrix of the $N$ users has a low-rank structure – a standard and practically successful assumption in the collaborative filtering setting – we design algorithms with significantly lower sample complexity compared to algorithms that learn the MDP individually for each user. Our main contribution is an algorithm which explores rewards collaboratively with $N$ user-specific MDPs and can learn rewards efficiently in two key settings: tabular MDPs and linear MDPs. When $N$ is large and the rank is constant, the sample complexity per MDP depends logarithmically on the size of the state space, which represents an exponential reduction (in the state-space size) compared to standard “non-collaborative” algorithms. Our main technical contribution is a method to construct policies which obtain data such that low-rank matrix completion is possible (without a generative model). This goes beyond the regular RL framework and is closely related to mean field limits of multi-agent RL.
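To make the low-rank reward assumption concrete, here is a minimal sketch (not the paper's algorithm): it builds a rank-$r$ reward matrix with one row per user and one column per state-action pair, hides most entries to mimic limited exploration, and recovers the matrix with a hard-thresholded SVD as a stand-in for a matrix-completion routine. All names, dimensions, and the completion step are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's method): low-rank reward
# matrix for N users over S*A state-action pairs, completed from a
# partially observed sample via a truncated SVD.
import numpy as np

rng = np.random.default_rng(0)

N, S, A, r = 50, 20, 5, 3          # users, states, actions, rank (r << min(N, S*A))
U = rng.normal(size=(N, r))        # user-specific factors
V = rng.normal(size=(S * A, r))    # shared state-action factors
R = U @ V.T                        # rank-r reward matrix, one row per user

# Each user only observes rewards on the (state, action) pairs its
# exploration visits; here entries are masked uniformly at random.
p_obs = 0.3
mask = rng.random(R.shape) < p_obs
observed = np.where(mask, R, 0.0)

# Hard-thresholded SVD as a stand-in for matrix completion: rescale for
# the sampling rate, then keep only the top-r singular directions.
u, s, vt = np.linalg.svd(observed / p_obs, full_matrices=False)
R_hat = (u[:, :r] * s[:r]) @ vt[:r]

print("relative error:", np.linalg.norm(R_hat - R) / np.linalg.norm(R))
```

Because every user shares the same rank-$r$ column factors, a few observed entries per user suffice to pin down the whole row, which is the intuition behind the per-MDP sample-complexity reduction described in the abstract.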
Cite
Text
Nagaraj et al. "Multi-User Reinforcement Learning with Low Rank Rewards." International Conference on Machine Learning, 2023.
Markdown
[Nagaraj et al. "Multi-User Reinforcement Learning with Low Rank Rewards." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/nagaraj2023icml-multiuser/)
BibTeX
@inproceedings{nagaraj2023icml-multiuser,
title = {{Multi-User Reinforcement Learning with Low Rank Rewards}},
author = {Nagaraj, Dheeraj Mysore and Kowshik, Suhas S and Agarwal, Naman and Netrapalli, Praneeth and Jain, Prateek},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {25627-25659},
volume = {202},
url = {https://mlanthology.org/icml/2023/nagaraj2023icml-multiuser/}
}