Rafailov, Rafael

36 publications

ICML 2025 Collapse or Thrive: Perils and Promises of Synthetic Data in a Self-Generating World Joshua Kazdan, Rylan Schaeffer, Apratim Dey, Matthias Gerstgrasser, Rafael Rafailov, David L. Donoho, Sanmi Koyejo
ICLRW 2025 MALT: Improving Reasoning with Multi-Agent LLM Training Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, Rafael Rafailov, Ivan Laptev, Philip Torr, Fabio Pizzati, Ronald Clark, Christian Schroeder de Witt
NeurIPS 2025 MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Zhaorun Chen, Zichen Wen, Yichao Du, Yiyang Zhou, Chenhang Cui, Siwei Han, Zhenzhen Weng, Chaoqi Wang, Zhengwei Tong, Leria Huang, Canyu Chen, Haoqin Tu, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
NeurIPS 2025 MJ-Video: Benchmarking and Rewarding Video Generation with Fine-Grained Video Preference Haibo Tong, Zhaoyang Wang, Zhaorun Chen, Haonian Ji, Shi Qiu, Siwei Han, Kexin Geng, Zhongkai Xue, Yiyang Zhou, Peng Xia, Mingyu Ding, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
TMLR 2025 Reliable and Responsible Foundation Models Xinyu Yang, Junlin Han, Rishi Bommasani, Jinqi Luo, Wenjie Qu, Wangchunshu Zhou, Adel Bibi, Xiyao Wang, Jaehong Yoon, Elias Stengel-Eskin, Shengbang Tong, Lingfeng Shen, Rafael Rafailov, Runjia Li, Zhaoyang Wang, Yiyang Zhou, Chenhang Cui, Yu Wang, Wenhao Zheng, Huichi Zhou, Jindong Gu, Zhaorun Chen, Peng Xia, Tony Lee, Thomas P Zollo, Vikash Sehwag, Jixuan Leng, Jiuhai Chen, Yuxin Wen, Huan Zhang, Zhun Deng, Linjun Zhang, Pavel Izmailov, Pang Wei Koh, Yulia Tsvetkov, Andrew Gordon Wilson, Jiaheng Zhang, James Zou, Cihang Xie, Hao Wang, Philip Torr, Julian McAuley, David Alvarez-Melis, Florian Tramèr, Kaidi Xu, Suman Jana, Chris Callison-Burch, Rene Vidal, Filippos Kokkinos, Mohit Bansal, Beidi Chen, Huaxiu Yao
NeurIPSW 2024 Accumulating Data Avoids Model Collapse Joshua Kazdan, Apratim Dey, Rylan Schaeffer, Matthias Gerstgrasser, Rafael Rafailov, David L. Donoho, Sanmi Koyejo
ICLRW 2024 Aligning Modalities in Vision Large Language Models via Preference Fine-Tuning Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
ICLR 2024 An Emulator for Fine-Tuning Large Language Models Using Small Language Models Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher D Manning
ICLR 2024 Contrastive Preference Learning: Learning from Human Feedback Without Reinforcement Learning Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh
CVPR 2024 Diffusion Model Alignment Using Direct Preference Optimization Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik
L4DC 2024 Efficient Imitation Learning with Conservative World Models Victor Kolev, Rafael Rafailov, Kyle Hatch, Jiajun Wu, Chelsea Finn
ICMLW 2024 Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data Matthias Gerstgrasser, Rylan Schaeffer, Apratim Dey, Rafael Rafailov, Tomasz Korbak, Henry Sleight, Rajashree Agrawal, John Hughes, Dhruv Bhandarkar Pai, Andrey Gromov, Dan Roberts, Diyi Yang, David L. Donoho, Sanmi Koyejo
ICLR 2024 Language Model Detectors Are Easily Optimized Against Charlotte Nicks, Eric Mitchell, Rafael Rafailov, Archit Sharma, Christopher D Manning, Chelsea Finn, Stefano Ermon
ICMLW 2024 MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge? Zhaorun Chen, Yichao Du, Zichen Wen, Yiyang Zhou, Chenhang Cui, Zhenzhen Weng, Haoqin Tu, Chaoqi Wang, Zhengwei Tong, Leria Huang, Canyu Chen, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
CoRL 2024 OpenVLA: An Open-Source Vision-Language-Action Model Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan P Foster, Pannag R Sanketi, Quan Vuong, Thomas Kollar, Benjamin Burchfiel, Russ Tedrake, Dorsa Sadigh, Sergey Levine, Percy Liang, Chelsea Finn
ICML 2024 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar
NeurIPS 2024 Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms Rafael Rafailov, Yaswanth Chittepu, Ryan Park, Harshit Sikchi, Joey Hejna, W. Bradley Knox, Chelsea Finn, Scott Niekum
ICMLW 2024 Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms Rafael Rafailov, Yaswanth Chittepu, Ryan Park, Harshit Sikchi, Joey Hejna, W. Bradley Knox, Chelsea Finn, Scott Niekum
NeurIPS 2024 Self-Supervised Alignment with Mutual Information: Learning to Follow Principles Without Preference Labels Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman
NeurIPSW 2023 An Emulator for Fine-Tuning Large Language Models Using Small Language Models Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher Manning
L4DC 2023 Contrastive Example-Based Control Kyle Beltran Hatch, Benjamin Eysenbach, Rafael Rafailov, Tianhe Yu, Ruslan Salakhutdinov, Sergey Levine, Chelsea Finn
NeurIPS 2023 Direct Preference Optimization: Your Language Model Is Secretly a Reward Model Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, Chelsea Finn
ICMLW 2023 Direct Preference Optimization: Your Language Model Is Secretly a Reward Model Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D Manning, Chelsea Finn
NeurIPSW 2023 Language Model Detectors Are Easily Optimized Against Charlotte Nicks, Eric Mitchell, Rafael Rafailov, Archit Sharma, Christopher Manning, Chelsea Finn, Stefano Ermon
CoRL 2023 MOTO: Offline Pre-Training to Online Fine-Tuning for Model-Based Robot Learning Rafael Rafailov, Kyle Beltran Hatch, Victor Kolev, John D. Martin, Mariano Phielipp, Chelsea Finn
ICLRW 2023 MOTO: Offline to Online Fine-Tuning for Model-Based Reinforcement Learning Rafael Rafailov, Kyle Beltran Hatch, Victor Kolev, John D Martin, Mariano Phielipp, Chelsea Finn
ICLRW 2023 Model-Based Adversarial Imitation Learning as Online Fine-Tuning Rafael Rafailov, Victor Kolev, Kyle Beltran Hatch, John D Martin, Mariano Phielipp, Jiajun Wu, Chelsea Finn
NeurIPSW 2022 Contrastive Example-Based Control Kyle Beltran Hatch, Sarthak J Shetty, Benjamin Eysenbach, Tianhe Yu, Rafael Rafailov, Ruslan Salakhutdinov, Sergey Levine, Chelsea Finn
NeurIPSW 2022 Contrastive Example-Based Control Kyle Beltran Hatch, Sarthak J Shetty, Benjamin Eysenbach, Tianhe Yu, Rafael Rafailov, Ruslan Salakhutdinov, Sergey Levine, Chelsea Finn
ICLR 2022 Vision-Based Manipulators Need to Also See from Their Hands Kyle Hsu, Moo Jin Kim, Rafael Rafailov, Jiajun Wu, Chelsea Finn
NeurIPS 2021 COMBO: Conservative Offline Model-Based Policy Optimization Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn
ICML 2021 Offline Meta-Reinforcement Learning with Advantage Weighting Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn
L4DC 2021 Offline Reinforcement Learning from Images with Latent Space Models Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
NeurIPSW 2021 The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks Rafael Rafailov, Varun Kumar Vijay, Tianhe Yu, Avi Singh, Mariano Phielipp, Chelsea Finn
NeurIPS 2021 Visual Adversarial Imitation Learning Using Variational Models Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
ICMLW 2021 Visual Adversarial Imitation Learning Using Variational Models Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn