Tao, Leitian

8 publications

ICLR 2026. Hybrid Reinforcement: When Reward Is Sparse, Better to Be Dense. Leitian Tao, Ilia Kulikov, Swarnadeep Saha, Tianlu Wang, Jing Xu, Sharon Li, Jason E Weston, Ping Yu
ICLR 2026. RESTRAIN: From Spurious Votes to Signals — Self-Training RL with Self-Penalization. Zhaoning Yu, Zhaolun Su, Leitian Tao, Haozhu Wang, Aashu Singh, Hanchao Yu, Jianyu Wang, Hongyang Gao, Weizhe Yuan, Jason E Weston, Ping Yu, Jing Xu
TMLR 2025. CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement. Leitian Tao, Xiang Chen, Tong Yu, Tung Mai, Ryan A. Rossi, Yixuan Li, Saayan Mitra
NeurIPS 2025. Limited Preference Data? Learning Better Reward Model with Latent Space Synthesis. Leitian Tao, Xuefeng Du, Sharon Li
ICML 2025. Position: Challenges and Future Directions of Data-Centric AI Alignment. Min-Hsuan Yeh, Jeffrey Wang, Xuefeng Du, Seongheon Park, Leitian Tao, Shawn Im, Yixuan Li
ICLR 2025. Your Weak LLM Is Secretly a Strong Teacher for Alignment. Leitian Tao, Yixuan Li
ICCV 2023. Activate and Reject: Towards Safe Domain Generalization Under Category Shift. Chaoqi Chen, Luyao Tang, Leitian Tao, Hong-Yu Zhou, Yue Huang, Xiaoguang Han, Yizhou Yu
ICLR 2023. Non-Parametric Outlier Synthesis. Leitian Tao, Xuefeng Du, Jerry Zhu, Yixuan Li