Rashidinejad, Paria
8 publications
ICLR
2025
Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels Against Reward Hacking
NeurIPS
2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning