Mai, An

1 publication

NeurIPS 2025. Mitigating Reward Over-Optimization in Direct Alignment Algorithms with Importance Sampling. Nguyen Minh Phuc, Ngoc-Hieu Nguyen, Duy Minh Ho Nguyen, Anji Liu, An Mai, Binh T. Nguyen, Daniel Sonntag, Khoa D Doan