Mroueh, Youssef
47 publications
ICLR, 2026. Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training
NeurIPS, 2025. KL-Regularized RLHF with Multiple Reference Models: Exact Solutions and Sample Complexity
NeurIPS, 2024. Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking