Deo, Pranav

1 publication

TMLR 2023: Offline Reinforcement Learning with Mixture of Deterministic Policies. Takayuki Osa, Akinobu Hayashi, Pranav Deo, Naoki Morihira, Takahide Yoshiike