Lu, Yuxiao

3 publications

ICLR 2026. Discern Truth from Falsehood: Reducing Over-Refusal via Contrastive Refinement. Yuxiao Lu, Lin Xu, Yang Sun, Wenjun Li, Jie Shi.
ICLR 2025. Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs. Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham.
AAAI 2024. Handling Long and Richly Constrained Tasks Through Constrained Hierarchical Reinforcement Learning. Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham.