Phatale, Samrat

2 publications

AAAI 2025. Robust Multi-Objective Preference Alignment with Online DPO. Raghav Gupta, Ryan Sullivan, Yunxuan Li, Samrat Phatale, Abhinav Rastogi
ICML 2024. RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback. Harrison Lee, Samrat Phatale, Hassan Mansoor, Thomas Mesnard, Johan Ferret, Kellie Ren Lu, Colton Bishop, Ethan Hall, Victor Carbune, Abhinav Rastogi, Sushant Prakash