Mahmood, Rupam

3 publications

AISTATS 2022. An Alternate Policy Gradient Estimator for SoftMax Policies. Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, Rupam Mahmood
AISTATS 2022. Model-Free Policy Learning with Reward Gradients. Qingfeng Lan, Samuele Tosatto, Homayoon Farrahi, Rupam Mahmood
ICML 2022. A Temporal-Difference Approach to Policy Gradient Estimation. Samuele Tosatto, Andrew Patterson, Martha White, Rupam Mahmood