Gupta, Dhawal

9 publications

AAAI 2024 From Past to Future: Rethinking Eligibility Traces Dhawal Gupta, Scott M. Jordan, Shreyas Chaudhari, Bo Liu, Philip S. Thomas, Bruno Castro da Silva
NeurIPSW 2024 P3O: Pessimistic Preference-Based Policy Optimization for Robust Alignment from Preferences Dhawal Gupta, Christoph Dann, Alekh Agarwal
NeurIPSW 2024 The Agent-Environment Boundary Dhawal Gupta
ICLR 2023 A Mixture-of-Expert Approach to RL-Based Dialogue Management Yinlam Chow, Azamat Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS 2023 Behavior Alignment via Reward Function Optimization Dhawal Gupta, Yash Chandak, Scott Jordan, Philip S. Thomas, Bruno C. da Silva
NeurIPS 2023 Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management Dhawal Gupta, Yinlam Chow, Azamat Tulepbergenov, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPSW 2022 A Mixture-of-Expert Approach to RL-Based Dialogue Management Yinlam Chow, Azamat Tulepbergenov, Ofir Nachum, Dhawal Gupta, Moonkyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier
NeurIPS 2021 Structural Credit Assignment in Neural Networks Using Reinforcement Learning Dhawal Gupta, Gabor Mihucz, Matthew Schlegel, James Kostas, Philip S. Thomas, Martha White
ICML 2020 Gradient Temporal-Difference Learning with Regularized Corrections Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White