Zhang, Lunjun

13 publications

TMLR 2025 | D2 Actor Critic: Diffusion Actor Meets Distributional Critic | Lunjun Zhang, Shuo Han, Hanrui Lyu, Bradly C. Stadie
ICLR 2025 | Generative Verifiers: Reward Modeling as Next-Token Prediction | Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
NeurIPS 2025 | Thinking vs. Doing: Improving Agent Reasoning by Scaling Test-Time Interaction | Junhong Shen, Hao Bai, Lunjun Zhang, Yifei Zhou, Amrith Setlur, Shengbang Tong, Diego Caples, Nan Jiang, Tong Zhang, Ameet Talwalkar, Aviral Kumar
ICLR 2024 | Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion | Lunjun Zhang, Yuwen Xiong, Ze Yang, Sergio Casas, Rui Hu, Raquel Urtasun
NeurIPSW 2024 | Generative Verifiers: Reward Modeling as Next-Token Prediction | Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
NeurIPSW 2024 | Generative Verifiers: Reward Modeling as Next-Token Prediction | Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
ECCV 2024 | Learning to Drive via Asymmetric Self-Play | Chris Zhang, Sourav Biswas, Kelvin Wong, Kion Fallah, Lunjun Zhang, Dian Chen, Sergio Casas, Raquel Urtasun
CoRL 2023 | Learning Realistic Traffic Agents in Closed-Loop | Chris Zhang, James Tu, Lunjun Zhang, Kelvin Wong, Simon Suo, Raquel Urtasun
CVPR 2023 | Towards Unsupervised Object Detection from LiDAR Point Clouds | Lunjun Zhang, Anqi Joyce Yang, Yuwen Xiong, Sergio Casas, Bin Yang, Mengye Ren, Raquel Urtasun
NeurIPSW 2022 | Understanding Hindsight Goal Relabeling Requires Rethinking Divergence Minimization | Lunjun Zhang, Bradly C. Stadie
NeurIPSW 2022 | Understanding Hindsight Goal Relabeling Requires Rethinking Divergence Minimization | Lunjun Zhang, Bradly C. Stadie
ICML 2021 | World Model as a Graph: Learning Latent Landmarks for Planning | Lunjun Zhang, Ge Yang, Bradly C. Stadie
UAI 2020 | Learning Intrinsic Rewards as a Bi-Level Optimization Problem | Bradly Stadie, Lunjun Zhang, Jimmy Ba