Pang, Richard Yuanzhe

7 publications

ICML 2025. Self-Consistency Preference Optimization. Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang, Jing Xu, Maryam Fazel-Zarandi, Mohit Bansal, Sainbayar Sukhbaatar, Jason E Weston, Jane Yu
ICLR 2025. Transformers Struggle to Learn to Search. Abulhair Saparov, Srushti Ajay Pawar, Shreyas Pimpalgaonkar, Nitish Joshi, Richard Yuanzhe Pang, Vishakh Padmakumar, Mehran Kazemi, Najoung Kim, He He
NeurIPS 2024. Iterative Reasoning Preference Optimization. Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston
ICML 2024. Self-Rewarding Language Models. Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason E Weston
ICML 2023. Extrapolative Controlled Sequence Generation via Iterative Refinement. Vishakh Padmakumar, Richard Yuanzhe Pang, He He, Ankur P Parikh
NeurIPS 2023. Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples. Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Mehran Kazemi, Najoung Kim, He He
ICLR 2021. Text Generation by Learning from Demonstrations. Richard Yuanzhe Pang, He He