Bhatnagar, Shalabh

21 publications

ICCV 2025 One Encoder to Rule Them All: Representation Learning for Model-Free Visual Reinforcement Learning Using Fourier Neural Operators Parag Dutta, Mohd Ayyoob, Shalabh Bhatnagar, Ambedkar Dukkipati
AAAI 2025 Two-Timescale Critic-Actor for Average Reward MDPs with Function Approximation Prashansa Panda, Shalabh Bhatnagar
TMLR 2025 Variance Reduced Smoothed Functional REINFORCE Policy Gradient Algorithms Shalabh Bhatnagar, H R Deepak
AISTATS 2024 A Cubic-Regularized Policy Newton Algorithm for Reinforcement Learning Mizhaan P. Maniyar, Prashanth L.A., Akash Mondal, Shalabh Bhatnagar
UAI 2024 Finite-Time Analysis of Three-Timescale Constrained Actor-Critic and Constrained Natural Actor-Critic Algorithms Prashansa Panda, Shalabh Bhatnagar
ICML 2023 Off-Policy Average Reward Actor-Critic with Deterministic Policy Search Naman Saxena, Subhojyoti Khastagir, Shishir Kolathaya, Shalabh Bhatnagar
AAAI 2022 Gradient Temporal Difference with Momentum: Stability and Convergence Rohan Deb, Shalabh Bhatnagar
NeurIPS 2022 Model-Based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm Ashish K Jayant, Shalabh Bhatnagar
NeurIPSW 2021 Dynamic Mirror Descent Based Model Predictive Control for Accelerating Robot Learning Utkarsh Aashu Mishra, Soumya Rani Samineni, Prakhar Goel, Chandravaran Venkatasai Kunjeti, Himanshu Lodha, Aman Singh, Aditya Verma Sagi, Shalabh Bhatnagar, N Y Shishir
AAAI 2020 Hierarchical Average Reward Policy Gradient Algorithms (Student Abstract) Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar
CoRL 2020 Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach Kartik Paigwar, Lokesh Krishna, Sashank Tirumala, Naman Khetan, Aditya Varma, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya
MLJ 2018 An Incremental Off-Policy Search in a Model-Free Markov Decision Process Using a Single Sample Path Ajin George Joseph, Shalabh Bhatnagar
MLJ 2018 An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation Using Cross Entropy Method Ajin George Joseph, Shalabh Bhatnagar
AAAI 2015 A Generalized Reduced Linear Program for Markov Decision Processes Chandrashekar Lakshminarayanan, Shalabh Bhatnagar
NeurIPS 2014 Universal Option Models Hengshuai Yao, Csaba Szepesvári, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar
ICML 2010 Toward Off-Policy Learning Control with Function Approximation Hamid Reza Maei, Csaba Szepesvári, Shalabh Bhatnagar, Richard S. Sutton
NeurIPS 2009 Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation Hamid R. Maei, Csaba Szepesvári, Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton
ICML 2009 Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora
NeurIPS 2009 Multi-Step Dyna Planning for Policy Evaluation and Control Hengshuai Yao, Shalabh Bhatnagar, Dongcui Diao, Richard S. Sutton, Csaba Szepesvári
NeurIPS 2007 Incremental Natural Actor-Critic Algorithms Shalabh Bhatnagar, Mohammad Ghavamzadeh, Mark Lee, Richard S. Sutton
JMLR 2006 A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events Shalabh Bhatnagar, Vivek S. Borkar, Madhukar Akarapu