Wang, Zhaoran
143 publications
What and How Does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization. AISTATS 2025.
Provably Mitigating Overoptimization in RLHF: Your SFT Loss Is Implicitly an Adversarial Regularizer. NeurIPS 2024.
Provably Mitigating Overoptimization in RLHF: Your SFT Loss Is Implicitly an Adversarial Regularizer. ICMLW 2024.
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning. JMLR 2023.
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning. AISTATS 2023.
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration. NeurIPS 2023.
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms. NeurIPS 2023.
FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning. NeurIPS 2022.
Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence. NeurIPS 2022.
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets. ICML 2022.
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets. ICLRW 2022.
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes. ICML 2022.
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL. NeurIPS 2022.
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy. ICML 2022.
Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case. AISTATS 2021.
ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning. NeurIPSW 2021.
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning. NeurIPS 2021.
Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration. NeurIPS 2021.
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game. ICML 2021.
Randomized Exploration in Reinforcement Learning with General Value Function Approximation. ICML 2021.
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic. NeurIPS 2021.
Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model. ICML 2020.
Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach. NeurIPS 2020.
Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations. NeurIPS 2020.
Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy. ICLR 2019.
Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost. NeurIPS 2019.
Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding. AISTATS 2018.
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning. NeurIPS 2016.
NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization. NeurIPS 2016.