Yao, Jihan

2 publications

ICLR 2025. POTEC: Off-Policy Contextual Bandits for Large Action Spaces via Policy Decomposition. Yuta Saito, Jihan Yao, Thorsten Joachims.
ICLR 2025. Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only. Jihan Yao, Wenxuan Ding, Shangbin Feng, Lucy Lu Wang, Yulia Tsvetkov.