Ding, Yuyang

2 publications

ICLR 2026. FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning. Yuyang Ding, Chi Zhang, Juntao Li, Haibin Lin, Xin Liu, Min Zhang
NeurIPS 2025. SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning. Yuyang Ding, Xinyu Shi, Juntao Li, Xiaobo Liang, Zhaopeng Tu, Min Zhang