Mahmood, A. Rupam

19 publications

ICLR 2024 Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning Mohamed Elsayed, A. Rupam Mahmood
NeurIPS 2024 Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers Gautham Vasan, Mohamed Elsayed, Alireza Azimi, Jiamin He, Fahim Shahriar, Colin Bellinger, Martha White, A. Rupam Mahmood
NeurIPSW 2024 Deep Reinforcement Learning Without Experience Replay, Target Networks, or Batch Updates Mohamed Elsayed, Gautham Vasan, A. Rupam Mahmood
ICLR 2024 Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo Haque Ishfaq, Qingfeng Lan, Pan Xu, A. Rupam Mahmood, Doina Precup, Anima Anandkumar, Kamyar Azizzadenesheli
ICML 2024 Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning Mohamed Elsayed, Homayoon Farrahi, Felix Dangel, A. Rupam Mahmood
ICML 2024 Target Networks and Over-Parameterization Stabilize Off-Policy Bootstrapping with Function Approximation Fengdi Che, Chenjun Xiao, Jincheng Mei, Bo Dai, Ramki Gummadi, Oscar A Ramirez, Christopher K Harris, A. Rupam Mahmood, Dale Schuurmans
ICML 2023 Correcting Discount-Factor Mismatch in On-Policy Policy Gradient Methods Fengdi Che, Gautham Vasan, A. Rupam Mahmood
UAI 2023 Loosely Consistent Emphatic Temporal-Difference Learning Jiamin He, Fengdi Che, Yi Wan, A. Rupam Mahmood
TMLR 2023 Memory-Efficient Reinforcement Learning with Value-Based Knowledge Consolidation Qingfeng Lan, Yangchen Pan, Jun Luo, A. Rupam Mahmood
NeurIPSW 2023 Utility-Based Perturbed Gradient Descent: An Optimizer for Continual Learning Mohamed Elsayed, A. Rupam Mahmood
JMLR 2022 Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences Alan Chan, Hugo Silva, Sungsu Lim, Tadashi Kozuno, A. Rupam Mahmood, Martha White
NeurIPSW 2022 The Emphatic Approach to Average-Reward Policy Evaluation Jiamin He, Yi Wan, A. Rupam Mahmood
NeurIPSW 2022 The Emphatic Approach to Average-Reward Policy Evaluation Jiamin He, Yi Wan, A. Rupam Mahmood
IJCAI 2019 Autoregressive Policies for Continuous Control Deep Reinforcement Learning Dmytro Korenkevych, A. Rupam Mahmood, Gautham Vasan, James Bergstra
CoRL 2018 Benchmarking Reinforcement Learning Algorithms on Real-World Robots A. Rupam Mahmood, Dmytro Korenkevych, Gautham Vasan, William Ma, James Bergstra
JMLR 2018 On Generalized Bellman Equations and Temporal-Difference Learning Huizhen Yu, A. Rupam Mahmood, Richard S. Sutton
JMLR 2016 An Emphatic Approach to the Problem of Off-Policy Temporal-Difference Learning Richard S. Sutton, A. Rupam Mahmood, Martha White
JMLR 2016 True Online Temporal-Difference Learning Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton
NeurIPS 2014 Weighted Importance Sampling for Off-Policy Learning with Linear Function Approximation A. Rupam Mahmood, Hado P van Hasselt, Richard S. Sutton