Agarwal, Alekh
97 publications
NeurIPSW
2024
Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning
NeurIPSW
2024
P3O: Pessimistic Preference-Based Policy Optimization for Robust Alignment from Preferences
NeurIPS
2024
Small Steps No More: Global Convergence of Stochastic Gradient Bandits for Arbitrary Learning Rates
ICML
2022
Efficient Reinforcement Learning in Block MDPs: A Model-Free Representation Learning Approach
NeurIPS
2022
Model-Based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity
COLT
2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
JMLR
2021
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
NeurIPS
2019
Bias Correction of Learned Generative Models Using Likelihood-Free Importance Weighting
NeurIPS
2012
Stochastic Optimization and Sparse Statistical Recovery: Optimal Algorithms for High Dimensions
NeurIPS
2010
Fast Global Convergence Rates of Gradient Methods for High-Dimensional Statistical Recovery