Zanette, Andrea

23 publications

ICML 2025 · Accelerating Unbiased LLM Evaluation via Synthetic Feedback · Zhaoyi Zhou, Yuda Song, Andrea Zanette
ICLRW 2025 · Accelerating Unbiased LLM Evaluation via Synthetic Feedback · Zhaoyi Zhou, Yuda Song, Andrea Zanette
NeurIPS 2025 · Training Language Models to Reason Efficiently · Daman Arora, Andrea Zanette
ICMLW 2024 · Accelerating Best-of-N via Speculative Rejection · Ruiqi Zhang, Momin Haider, Ming Yin, Jiahao Qiu, Mengdi Wang, Peter Bartlett, Andrea Zanette
ICML 2024 · ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL · Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar
ICLRW 2024 · ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL · Yifei Zhou, Andrea Zanette, Jiayi Pan, Aviral Kumar, Sergey Levine
NeurIPS 2024 · Fast Best-of-N Decoding via Speculative Rejection · Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter L. Bartlett, Andrea Zanette
NeurIPS 2023 · Policy Finetuning in Reinforcement Learning via Design of Experiments Using Offline Data · Ruiqi Zhang, Andrea Zanette
ICML 2023 · When Is Realizability Sufficient for Off-Policy Reinforcement Learning? · Andrea Zanette
NeurIPS 2022 · Bellman Residual Orthogonalization for Offline Reinforcement Learning · Andrea Zanette, Martin J. Wainwright
ICML 2022 · Stabilizing Q-Learning with Linear Architectures for Provable Efficient Learning · Andrea Zanette, Martin J. Wainwright
COLT 2021 · Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation · Andrea Zanette, Ching-An Cheng, Alekh Agarwal
NeurIPS 2021 · Design of Experiments for Stochastic Contextual Linear Bandits · Andrea Zanette, Kefan Dong, Jonathan N. Lee, Emma Brunskill
ICML 2021 · Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL Can Be Exponentially Harder than Online RL · Andrea Zanette
NeurIPS 2021 · Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning · Andrea Zanette, Martin J. Wainwright, Emma Brunskill
AISTATS 2020 · Frequentist Regret Bounds for Randomized Least-Squares Value Iteration · Andrea Zanette, David Brandfonbrener, Emma Brunskill, Matteo Pirotta, Alessandro Lazaric
ICML 2020 · Learning Near Optimal Policies with Low Inherent Bellman Error · Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill
NeurIPS 2020 · Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration · Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill
NeurIPS 2019 · Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model · Andrea Zanette, Mykel J. Kochenderfer, Emma Brunskill
NeurIPS 2019 · Limiting Extrapolation in Linear Approximate Value Iteration · Andrea Zanette, Alessandro Lazaric, Mykel J. Kochenderfer, Emma Brunskill
ICML 2019 · Tighter Problem-Dependent Regret Bounds in Reinforcement Learning Without Domain Knowledge Using Value Function Bounds · Andrea Zanette, Emma Brunskill
ICML 2018 · Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs · Andrea Zanette, Emma Brunskill
ECML-PKDD 2018 · Robust Super-Level Set Estimation Using Gaussian Processes · Andrea Zanette, Junzi Zhang, Mykel J. Kochenderfer