Mondal, Washim Uddin
15 publications
NeurIPS
2025
Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
NeurIPS
2025
Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm
AISTATS
2025
Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs