Mukherjee, Subhojyoti

16 publications

ICML 2025 Logits Are All We Need to Adapt Closed Models Gaurush Hiranandani, Haolun Wu, Subhojyoti Mukherjee, Sanmi Koyejo
NeurIPS 2025 Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization Subhojyoti Mukherjee, Viet Dac Lai, Raghavendra Addanki, Ryan A. Rossi, Seunghyun Yoon, Trung Bui, Anup Rao, Jayakumar Subramanian, Branislav Kveton
ICMLW 2024 Off-Policy Evaluation from Logged Human Feedback Aniruddha Bhargava, Lalit K Jain, Branislav Kveton, Ge Liu, Subhojyoti Mukherjee
ICMLW 2024 Optimal Design for Human Feedback Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Anand Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton
NeurIPS 2024 Optimal Design for Human Preference Elicitation Subhojyoti Mukherjee, Anusha Lalitha, Kousha Kalantari, Aniket Deshmukh, Ge Liu, Yifei Ma, Branislav Kveton
AISTATS 2024 SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits Subhojyoti Mukherjee, Qiaomin Xie, Josiah P Hanna, Robert Nowak
ICML 2024 SaVeR: Optimal Data Collection Strategy for Safe Policy Evaluation in Tabular MDP Subhojyoti Mukherjee, Josiah P. Hanna, Robert D Nowak
NeurIPS 2023 Multi-Task Representation Learning for Pure Exploration in Bilinear Bandits Subhojyoti Mukherjee, Qiaomin Xie, Josiah Hanna, Robert Nowak
ICMLW 2023 SPEED: Experimental Design for Policy Evaluation in Linear Heteroscedastic Bandits Subhojyoti Mukherjee, Qiaomin Xie, Josiah P. Hanna, Robert D Nowak
AISTATS 2022 Chernoff Sampling for Active Testing and Extension to Active Regression Subhojyoti Mukherjee, Ardhendu S. Tripathy, Robert Nowak
AISTATS 2022 Nearly Optimal Algorithms for Level Set Estimation Blake Mason, Lalit Jain, Subhojyoti Mukherjee, Romain Camilleri, Kevin Jamieson, Robert Nowak
UAI 2022 ReVar: Strengthening Policy Evaluation via Reduced Variance Sampling Subhojyoti Mukherjee, Josiah P. Hanna, Robert D Nowak
UAI 2022 Safety Aware Changepoint Detection for Piecewise I.i.d. Bandits Subhojyoti Mukherjee
ICMLW 2019 Distribution-Dependent and Time-Uniform Bounds for Piecewise I.i.d Bandits Subhojyoti Mukherjee, Odalric Maillard
AAAI 2018 Efficient-UCBV: An Almost Optimal Algorithm Using Variance Estimates Subhojyoti Mukherjee, K. P. Naveen, Nandan Sudarsanam, Balaraman Ravindran
IJCAI 2017 Thresholding Bandits with Augmented UCB Subhojyoti Mukherjee, Kolar Purushothama Naveen, Nandan Sudarsanam, Balaraman Ravindran